public void map(LongWritable key, Text t, OutputCollector<IntWritable, PageRankNode> output,
    Reporter reporter) throws IOException {
  // Each input line is "<nodeId> <neighbor> <neighbor> ...", separated by whitespace.
  String[] arr = t.toString().trim().split("\\s+");

  int id = Integer.parseInt(arr[0]);
  nid.set(id);
  node.setNodeId(id);

  if (arr.length == 1) {
    // No outgoing edges: store an empty adjacency list.
    node.setAdjacencyList(new ArrayListOfIntsWritable());
  } else {
    int[] neighbors = new int[arr.length - 1];
    for (int i = 1; i < arr.length; i++) {
      neighbors[i - 1] = Integer.parseInt(arr[i]);
    }
    node.setAdjacencyList(new ArrayListOfIntsWritable(neighbors));
  }

  // Bookkeeping.
  reporter.incrCounter("graph", "numNodes", 1);
  reporter.incrCounter("graph", "numEdges", arr.length - 1);
  if (arr.length > 1) {
    reporter.incrCounter("graph", "numActiveNodes", 1);
  }

  output.collect(nid, node);
}
@Override
public void map(IntWritable nid, PageRankNode node, Context context)
    throws IOException, InterruptedException {
  float p = node.getPageRank();

  // Random-jump term: ALPHA / nodeCnt, in log space.
  float jump = (float) (Math.log(ALPHA) - Math.log(nodeCnt));
  // Link-following term: (1 - ALPHA) * (p + missingMass / nodeCnt), in log space.
  float link = (float) Math.log(1.0f - ALPHA)
      + sumLogProbs(p, (float) (Math.log(missingMass) - Math.log(nodeCnt)));

  p = sumLogProbs(jump, link);
  node.setPageRank(p);

  context.write(nid, node);
}
@Override
public void cleanup(Context context) throws IOException, InterruptedException {
  // Now emit the messages all at once.
  IntWritable k = new IntWritable();
  PageRankNode mass = new PageRankNode();

  for (MapIF.Entry e : map.entrySet()) {
    k.set(e.getKey());

    mass.setNodeId(e.getKey());
    mass.setType(PageRankNode.Type.Mass);
    mass.setPageRank(e.getValue());

    context.write(k, mass);
  }
}
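public void write(DataOutput out) throws IOException {
  out.writeBoolean(isNode);
  // Branch on the same flag that readFields() reads back, so the payload
  // that follows always matches the boolean written above.
  if (isNode) {
    node.write(out);
  } else {
    contribution.write(out);
  }
}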
public void readFields(DataInput in) throws IOException {
  isNode = in.readBoolean();
  if (isNode) {
    node = new PageRankNode();
    node.readFields(in);
  } else {
    contribution = new Contribution();
    contribution.readFields(in);
  }
}
@Override
public void reduce(IntWritable nid, Iterable<PageRankNode> values, Context context)
    throws IOException, InterruptedException {
  int massMessages = 0;

  // Remember, PageRank mass is stored as a log prob.
  float mass = Float.NEGATIVE_INFINITY;
  for (PageRankNode n : values) {
    if (n.getType() == PageRankNode.Type.Structure) {
      // Simply pass along node structure.
      context.write(nid, n);
    } else {
      // Accumulate PageRank mass contributions.
      mass = sumLogProbs(mass, n.getPageRank());
      massMessages++;
    }
  }

  // Emit aggregated results.
  if (massMessages > 0) {
    intermediateMass.setNodeId(nid.get());
    intermediateMass.setType(PageRankNode.Type.Mass);
    intermediateMass.setPageRank(mass);

    context.write(nid, intermediateMass);
  }
}
@Override
public void map(IntWritable nid, PageRankNode node, Context context)
    throws IOException, InterruptedException {
  // Pass along node structure.
  intermediateStructure.setNodeId(node.getNodeId());
  intermediateStructure.setType(PageRankNode.Type.Structure);
  intermediateStructure.setAdjacencyList(node.getAdjacenyList());

  context.write(nid, intermediateStructure);

  int massMessages = 0;
  int massMessagesSaved = 0;

  // Distribute PageRank mass to neighbors (along outgoing edges).
  if (node.getAdjacenyList().size() > 0) {
    // Each neighbor gets an equal share of PageRank mass.
    ArrayListOfIntsWritable list = node.getAdjacenyList();
    float mass = node.getPageRank() - (float) StrictMath.log(list.size());

    context.getCounter(PageRank.edges).increment(list.size());

    // Iterate over neighbors.
    for (int i = 0; i < list.size(); i++) {
      int neighbor = list.get(i);

      if (map.containsKey(neighbor)) {
        // A message is already destined for that node; add the PageRank mass contribution.
        massMessagesSaved++;
        map.put(neighbor, sumLogProbs(map.get(neighbor), mass));
      } else {
        // New destination node; add new entry in map.
        massMessages++;
        map.put(neighbor, mass);
      }
    }
  }

  // Bookkeeping.
  context.getCounter(PageRank.nodes).increment(1);
  context.getCounter(PageRank.massMessages).increment(massMessages);
  context.getCounter(PageRank.massMessagesSaved).increment(massMessagesSaved);
}
public void configure(JobConf job) {
  int n = job.getInt("NodeCnt", 0);
  node.setType(PageRankNode.TYPE_COMPLETE);
  // Uniform initial distribution, stored as a log prob: log(1/n) = -log(n).
  node.setPageRank((float) -StrictMath.log(n));
}
@Override
public void map(IntWritable nid, PageRankNode node, Context context)
    throws IOException, InterruptedException {
  // Pass along node structure.
  intermediateStructure.setNodeId(node.getNodeId());
  intermediateStructure.setType(PageRankNode.Type.Structure);
  intermediateStructure.setAdjacencyList(node.getAdjacenyList());

  context.write(nid, intermediateStructure);

  int massMessages = 0;

  // Distribute PageRank mass to neighbors (along outgoing edges).
  if (node.getAdjacenyList().size() > 0) {
    // Each neighbor gets an equal share of PageRank mass.
    ArrayListOfIntsWritable list = node.getAdjacenyList();
    float mass = node.getPageRank() - (float) StrictMath.log(list.size());

    context.getCounter(PageRank.edges).increment(list.size());

    // Iterate over neighbors.
    for (int i = 0; i < list.size(); i++) {
      neighbor.set(list.get(i));
      intermediateMass.setNodeId(list.get(i));
      intermediateMass.setType(PageRankNode.Type.Mass);
      intermediateMass.setPageRank(mass);

      // Emit messages with PageRank mass to neighbors.
      context.write(neighbor, intermediateMass);
      massMessages++;
    }
  }

  // Bookkeeping.
  context.getCounter(PageRank.nodes).increment(1);
  context.getCounter(PageRank.massMessages).increment(massMessages);
}
@Override
public void reduce(IntWritable nid, Iterable<PageRankNode> iterable, Context context)
    throws IOException, InterruptedException {
  Iterator<PageRankNode> values = iterable.iterator();

  // Create the node structure that we're going to assemble back together from shuffled pieces.
  PageRankNode node = new PageRankNode();

  node.setType(PageRankNode.Type.Complete);
  node.setNodeId(nid.get());

  int massMessagesReceived = 0;
  int structureReceived = 0;

  float mass = Float.NEGATIVE_INFINITY;
  while (values.hasNext()) {
    PageRankNode n = values.next();

    if (n.getType().equals(PageRankNode.Type.Structure)) {
      // This is the structure; update accordingly.
      ArrayListOfIntsWritable list = n.getAdjacenyList();
      structureReceived++;

      node.setAdjacencyList(list);
    } else {
      // This is a message that contains PageRank mass; accumulate.
      mass = sumLogProbs(mass, n.getPageRank());
      massMessagesReceived++;
    }
  }

  // Update the final accumulated PageRank mass.
  node.setPageRank(mass);
  context.getCounter(PageRank.massMessagesReceived).increment(massMessagesReceived);

  // Error checking.
  if (structureReceived == 1) {
    // Everything checks out, emit final node structure with updated PageRank value.
    context.write(nid, node);

    // Keep track of total PageRank mass.
    totalMass = sumLogProbs(totalMass, mass);
  } else if (structureReceived == 0) {
    // We get into this situation if there exists an edge pointing to a node which has no
    // corresponding node structure (i.e., PageRank mass was passed to a non-existent node)...
    // log and count but move on.
    context.getCounter(PageRank.missingStructure).increment(1);
    LOG.warn("No structure received for nodeid: " + nid.get()
        + " mass: " + massMessagesReceived);
    // It's important to note that we don't add the PageRank mass to total... if PageRank mass
    // was sent to a non-existent node, it should simply vanish.
  } else {
    // This shouldn't happen!
    throw new RuntimeException("Multiple structure received for nodeid: " + nid.get()
        + " mass: " + massMessagesReceived + " struct: " + structureReceived);
  }
}
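
Several of the methods above call `sumLogProbs`, whose definition is not shown. The following is a minimal sketch of what such a helper might look like, assuming PageRank mass is stored as a natural-log probability and that `Float.NEGATIVE_INFINITY` stands for zero mass (as the reducers' initial `mass` value suggests). The class name `LogProbs` is hypothetical and is not taken from the source.

```java
// Hypothetical holder class; only the method name sumLogProbs appears in the source.
final class LogProbs {
  private LogProbs() {}

  /**
   * Returns log(exp(a) + exp(b)) without leaving log space.
   * Assumes a and b are natural-log probabilities, with
   * Float.NEGATIVE_INFINITY representing a probability of zero.
   */
  static float sumLogProbs(float a, float b) {
    // Adding zero mass leaves the other operand unchanged.
    if (a == Float.NEGATIVE_INFINITY) {
      return b;
    }
    if (b == Float.NEGATIVE_INFINITY) {
      return a;
    }
    // Factor out the larger operand so the exponent is never positive,
    // which keeps exp() from overflowing.
    if (a < b) {
      return (float) (b + StrictMath.log1p(StrictMath.exp(a - b)));
    }
    return (float) (a + StrictMath.log1p(StrictMath.exp(b - a)));
  }
}
```

For example, `LogProbs.sumLogProbs((float) Math.log(0.25), (float) Math.log(0.25))` should come out close to `Math.log(0.5)`. Factoring out the larger operand is one common way to keep the sum numerically stable when the probabilities are very small.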