Java SparkUtilities 예제들

프로그래밍 언어: Java

네임스페이스/패키지 이름: java.nio.file

클래스/타입: SparkUtilities

hotexamples.com에서의 예제들: 3

Java SparkUtilities - 3개의 예제가 발견되었습니다. 이것들은 오픈소스 프로젝트에서 추출된 Java의 java.nio.file.SparkUtilities에 대한 실세계 최고 등급의 예제들입니다. 예제들을 평가하여 예제의 품질 향상에 도움을 줄 수 있습니다.

자주 사용되는 메소드들

보기 숨기기

realizeAndReturn(2)

fromIterable(1)

getCurrentContext(1)

예제 #1

파일 보기

파일: SparkMapReduce.java 프로젝트: tanfei/distributed-tools

  // @Override
  public void performSourceMapReduce(JavaRDD<KeyValueObject<KEYIN, VALUEIN>> pInputs) {
    // if not commented out this line forces mappedKeys to be realized
    //    pInputs = SparkUtilities.realizeAndReturn(pInputs,getCtx());
    JavaSparkContext ctx2 = SparkUtilities.getCurrentContext();
    System.err.println("Starting Score Mapping");
    JavaPairRDD<K, Tuple2<K, V>> kkv = performMappingPart(pInputs);
    //      kkv = SparkUtilities.realizeAndReturn(kkv, ctx2);

    //        mappedKeys = mappedKeys.persist(StorageLevel.MEMORY_AND_DISK_2());
    //        // if not commented out this line forces mappedKeys to be realized
    //        mappedKeys = SparkUtilities.realizeAndReturn(mappedKeys, ctx2);
    //
    //        // convert to tuples
    //     //   JavaPairRDD<K, Tuple2<K, V>> kkv = mappedKeys.mapToPair(new KeyValuePairFunction<K,
    // V>());
    //
    //        kkv = kkv.persist(StorageLevel.MEMORY_AND_DISK_2());
    //        // if not commented out this line forces mappedKeys to be realized
    //       kkv = SparkUtilities.realizeAndReturn(kkv, ctx2);

    // if not commented out this line forces kvJavaPairRDD to be realized
    // kkv = SparkUtilities.realizeAndReturn(kkv );

    System.err.println("Starting Score Reduce");
    IReducerFunction reduce = getReduce();
    // for some reason the compiler thnks K or V is not Serializable
    JavaPairRDD<K, Tuple2<K, V>> kkv1 = kkv;

    // JavaPairRDD<? extends Serializable, Tuple2<? extends Serializable, ? extends Serializable>>
    // kkv1 = (JavaPairRDD<? extends Serializable, Tuple2<? extends Serializable, ? extends
    // Serializable>>)kkv;
    //noinspection unchecked
    JavaPairRDD<K, KeyAndValues<K, V>> reducedSets =
        (JavaPairRDD<K, KeyAndValues<K, V>>) KeyAndValues.combineByKey(kkv1);

    // if not commented out this line forces kvJavaPairRDD to be realized
    reducedSets = SparkUtilities.realizeAndReturn(reducedSets);

    PartitionAdaptor<K> prt = new PartitionAdaptor<K>(getPartitioner());
    reducedSets = reducedSets.partitionBy(prt);
    reducedSets = reducedSets.sortByKey();

    // if not commented out this line forces kvJavaPairRDD to be realized
    reducedSets = SparkUtilities.realizeAndReturn(reducedSets);

    ReduceFunctionAdaptor f = new ReduceFunctionAdaptor(ctx2, reduce);

    JavaRDD<KeyValueObject<KOUT, VOUT>> reducedOutput = reducedSets.flatMap(f);

    //  JavaPairRDD<K, V> kvJavaPairRDD = asTuples.partitionBy(sparkPartitioner);

    // if not commented out this line forces kvJavaPairRDD to be realized
    // kvJavaPairRDD = SparkUtilities.realizeAndReturn(kvJavaPairRDD,getCtx());

    // if not commented out this line forces kvJavaPairRDD to be realized
    //  reducedOutput = SparkUtilities.realizeAndReturn(reducedOutput, ctx2);

    output = reducedOutput;
  }

예제 #2

파일 보기

파일: SparkMapReduce.java 프로젝트: tanfei/distributed-tools

  // @Override
  public void performSingleReturnMapReduce(JavaRDD<KeyValueObject<KEYIN, VALUEIN>> pInputs) {
    // if not commented out this line forces mappedKeys to be realized
    //    pInputs = SparkUtilities.realizeAndReturn(pInputs,getCtx());
    JavaPairRDD<K, Tuple2<K, V>> kkv = performMappingPart(pInputs);

    // if not commented out this line forces kvJavaPairRDD to be realized
    kkv = SparkUtilities.realizeAndReturn(kkv);

    PartitionAdaptor<K> prt = new PartitionAdaptor<K>(getPartitioner());
    kkv = kkv.partitionBy(prt);

    IReducerFunction reduce = getReduce();
    /** we can guarantee one output per input */
    SingleOutputReduceFunctionAdaptor<K, V, KOUT, VOUT> f =
        new SingleOutputReduceFunctionAdaptor((ISingleOutputReducerFunction) reduce);
    JavaRDD<KeyValueObject<KOUT, VOUT>> reduced = kkv.map(f);

    // if not commented out this line forces kvJavaPairRDD to be realized
    reduced = SparkUtilities.realizeAndReturn(reduced);

    output = reduced;
  }

예제 #3

파일 보기

파일: SparkMapReduce.java 프로젝트: tanfei/distributed-tools

 /**
  * sources may be very implementation specific
  *
  * @param source some source of data - might be a hadoop directory or a Spark RDD - this will be
  *     cast internally
  * @param otherData
  */
 @Override
 public void mapReduceSource(@Nonnull final Object source, final Object... otherData) {
   if (source instanceof JavaRDD) {
     performSourceMapReduce((JavaRDD) source);
     return;
   }
   if (source instanceof Path) {
     performMapReduce((Path) source);
     return;
   }
   if (source instanceof java.lang.Iterable) {
     performSourceMapReduce(SparkUtilities.fromIterable((Iterable) source));
     return;
   }
   throw new IllegalArgumentException("cannot handle source of class " + source.getClass());
 }