MapReduceOper examples in Java

Programming language: Java

Namespace / package name: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer

Class / Type: MapReduceOper

Examples on hotexamples.com: 2

MapReduceOper in Java - 2 examples found. These are the top-rated real-world examples of org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper in Java, extracted from open source projects.

Frequent methods

getRequestedParallelism(2)

getQuantFile(1)

isGlobalSort(1)

isSkewedJoin(1)

Example #1

File: TestJobSubmission.java Project: scr/pig

@Test
public void testDefaultParallelInSkewJoin() throws Throwable {
    // default_parallel is considered only at runtime, so here we only test requested parallel
    // more thorough tests can be found in TestNumberOfReducers.java
    String query = "a = load 'input';"
            + "b = load 'input';"
            + "c = join a by $0, b by $0 using 'skewed' parallel 100;"
            + "store c into 'output';";
    PigServer ps = new PigServer(cluster.getExecType(), cluster.getProperties());
    PhysicalPlan pp = Util.buildPp(ps, query);
    MROperPlan mrPlan = Util.buildMRPlan(pp, pc);

    // Get the skew join job
    Iterator<MapReduceOper> iter = mrPlan.getKeys().values().iterator();
    int counter = 0;
    while (iter.hasNext()) {
        MapReduceOper op = iter.next();
        counter++;
        if (op.isSkewedJoin()) {
            assertTrue(op.getRequestedParallelism() == 100);
        }
    }
    assertEquals(3, counter);

    pc.defaultParallel = -1;
}

Example #2

File: TestJobSubmission.java Project: scr/pig

@Test
public void testReducerNumEstimationForOrderBy() throws Exception {
    // Skip the test for Tez. Tez uses a different mechanism.
    // Equivalent test is in TestTezAutoParallelism
    Assume.assumeTrue("Skip this test for TEZ", Util.isMapredExecType(cluster.getExecType()));

    // use the estimation
    pc.getProperties().setProperty("pig.exec.reducers.bytes.per.reducer", "100");
    pc.getProperties().setProperty("pig.exec.reducers.max", "10");

    String query = "a = load '/passwd';"
            + "b = order a by $0;"
            + "store b into 'output';";
    PigServer ps = new PigServer(cluster.getExecType(), cluster.getProperties());
    PhysicalPlan pp = Util.buildPp(ps, query);
    MROperPlan mrPlan = Util.buildMRPlanWithOptimizer(pp, pc);

    Configuration conf = ConfigurationUtil.toConfiguration(pc.getProperties());
    JobControlCompiler jcc = new JobControlCompiler(pc, conf);
    JobControl jobControl = jcc.compile(mrPlan, query);

    assertEquals(2, mrPlan.size());

    // first job uses a single reducer for the sampling
    Util.assertParallelValues(-1, 1, -1, 1, jobControl.getWaitingJobs().get(0).getJobConf());

    // Simulate the first job having run so estimation kicks in.
    MapReduceOper sort = mrPlan.getLeaves().get(0);
    jcc.updateMROpPlan(jobControl.getReadyJobs());
    FileLocalizer.create(sort.getQuantFile(), pc);
    jobControl = jcc.compile(mrPlan, query);

    sort = mrPlan.getLeaves().get(0);
    long reducer = Math.min(
            (long) Math.ceil(new File("test/org/apache/pig/test/data/passwd").length() / 100.0), 10);
    assertEquals(reducer, sort.getRequestedParallelism());

    // the second job estimates reducers
    Util.assertParallelValues(
            -1, -1, reducer, reducer, jobControl.getWaitingJobs().get(0).getJobConf());

    // use the PARALLEL keyword, it will override the estimated reducer number
    query = "a = load '/passwd';"
            + "b = order a by $0 PARALLEL 2;"
            + "store b into 'output';";
    pp = Util.buildPp(ps, query);
    mrPlan = Util.buildMRPlanWithOptimizer(pp, pc);

    assertEquals(2, mrPlan.size());

    sort = mrPlan.getLeaves().get(0);
    assertEquals(2, sort.getRequestedParallelism());

    // the estimation won't take effect when it applies to non-dfs input or the files
    // don't exist, such as hbase
    query = "a = load 'hbase://passwd' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('c:f1 c:f2');"
            + "b = order a by $0 ;"
            + "store b into 'output';";
    pp = Util.buildPp(ps, query);
    mrPlan = Util.buildMRPlanWithOptimizer(pp, pc);
    assertEquals(2, mrPlan.size());

    sort = mrPlan.getLeaves().get(0);

    // the requested parallel will be -1 if users don't set any of default_parallel, parallel
    // and the estimation doesn't take effect. MR framework will finally set it to 1.
    assertEquals(-1, sort.getRequestedParallelism());

    // test order by with three jobs (after optimization)
    query = "a = load '/passwd';"
            + "b = foreach a generate $0, $1, $2;"
            + "c = order b by $0;"
            + "store c into 'output';";
    pp = Util.buildPp(ps, query);
    mrPlan = Util.buildMRPlanWithOptimizer(pp, pc);
    assertEquals(3, mrPlan.size());

    // Simulate the first 2 jobs having run so estimation kicks in.
    sort = mrPlan.getLeaves().get(0);
    FileLocalizer.create(sort.getQuantFile(), pc);

    jobControl = jcc.compile(mrPlan, query);
    Util.copyFromLocalToCluster(
            cluster,
            "test/org/apache/pig/test/data/passwd",
            ((POLoad) sort.mapPlan.getRoots().get(0)).getLFile().getFileName());

    // First job is just foreach with projection, mapper-only job, so estimate gets ignored
    Util.assertParallelValues(-1, -1, -1, 0, jobControl.getWaitingJobs().get(0).getJobConf());

    jcc.updateMROpPlan(jobControl.getReadyJobs());
    jobControl = jcc.compile(mrPlan, query);
    jcc.updateMROpPlan(jobControl.getReadyJobs());

    // Second job is a sampler, which requests and gets 1 reducer
    Util.assertParallelValues(-1, 1, -1, 1, jobControl.getWaitingJobs().get(0).getJobConf());

    jobControl = jcc.compile(mrPlan, query);
    sort = mrPlan.getLeaves().get(0);
    assertEquals(reducer, sort.getRequestedParallelism());

    // Third job is the order, which uses the estimated number of reducers
    Util.assertParallelValues(
            -1, -1, reducer, reducer, jobControl.getWaitingJobs().get(0).getJobConf());
}
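
The expected reducer count in the second example above is computed as the input size divided by the bytes-per-reducer setting, rounded up, then capped at the maximum reducer setting. The following is a minimal standalone sketch of that arithmetic only. The class name `ReducerEstimateSketch` and the method `estimateReducers` are hypothetical names chosen for illustration and are not part of Pig; the sketch mirrors the expression the test uses for its expected value and does not claim to reproduce Pig's actual estimator.

```java
public class ReducerEstimateSketch {

    /**
     * Hypothetical helper (not a Pig API): mirrors the expected-value
     * expression in the test, min(ceil(inputBytes / bytesPerReducer), maxReducers).
     */
    public static long estimateReducers(long inputBytes, long bytesPerReducer, long maxReducers) {
        // Divide as doubles so a partial chunk still counts as one more reducer.
        long needed = (long) Math.ceil(inputBytes / (double) bytesPerReducer);
        // Cap the result at the configured maximum.
        return Math.min(needed, maxReducers);
    }

    public static void main(String[] args) {
        // With the settings used in the test: 100 bytes per reducer, at most 10 reducers.
        System.out.println(estimateReducers(250, 100, 10));   // prints 3
        System.out.println(estimateReducers(5000, 100, 10));  // prints 10
    }
}
```

With an input of 250 bytes the estimate is 3 reducers, since 2.5 rounds up; with 5000 bytes the raw estimate of 50 is capped at 10. In the test, the same calculation is applied to the length of the local `passwd` data file.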