Java Counter Examples

Programming language: Java

Namespace/package name: edu.jhu.hlt.parma.util

Class/type: Counter

Examples on hotexamples.com: 10

Java Counter - 10 examples found. These are the top-rated real-world Java examples of edu.jhu.hlt.parma.util.Counter, extracted from open source projects.

Frequently used methods

getCount(5)

keySet(5)

containsKey(2)

incrementCount(2)

totalCount(2)

setCount(1)
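
The method names listed above suggest a map-backed counter with double-valued counts. The following is a minimal sketch of such a class, offered only as an illustration: `SimpleCounter` is a hypothetical stand-in and not the actual edu.jhu.hlt.parma.util.Counter, and returning 0.0 for a missing key is an assumption inferred from how the examples on this page test `getCount` results against 0.0.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a Counter-like class; not the parma implementation. */
class SimpleCounter<K> {
  private final Map<K, Double> counts = new HashMap<>();
  private double total = 0.0; // running sum of all stored counts

  /** Adds 1.0 to the count of key, creating the entry if needed. */
  void incrementCount(K key) {
    setCount(key, getCount(key) + 1.0);
  }

  /** Overwrites the count of key and keeps the running total in sync. */
  void setCount(K key, double value) {
    total += value - getCount(key);
    counts.put(key, value);
  }

  /** Returns the stored count, or 0.0 for an unseen key (assumed behaviour). */
  double getCount(K key) {
    return counts.getOrDefault(key, 0.0);
  }

  boolean containsKey(K key) {
    return counts.containsKey(key);
  }

  Set<K> keySet() {
    return counts.keySet();
  }

  double totalCount() {
    return total;
  }

  public static void main(String[] args) {
    SimpleCounter<String> c = new SimpleCounter<>();
    c.incrementCount("the");
    c.incrementCount("the");
    c.incrementCount("cat");
    System.out.println(c.getCount("the")); // prints 2.0
    System.out.println(c.totalCount());    // prints 3.0
  }
}
```

Keeping a running total makes `totalCount()` constant-time, at the cost of a small floating-point drift if counts are overwritten many times; summing the map values on demand would be the alternative.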

Example #1

File: UnigramLM.java Project: hltcoe/parma

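 public double getCount(K token) {
   // Fail fast on unseen tokens rather than returning a default count.
   if (!lm.containsKey(token)) {
     System.err.println(lm.keySet().size()); // vocabulary size, logged for debugging
     throw new RuntimeException("token not in keyset");
   }
   return lm.getCount(token);
 }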

Example #2

File: UnigramLM.java Project: hltcoe/parma

  /**
   * GT smoothing with least squares interpolation. This follows the procedure in Jurafsky and
   * Martin sect. 4.5.3.
   */
  public void smoothAndNormalize() {
    Counter<Integer> cntCounter = new Counter<Integer>();
    for (K tok : lm.keySet()) {
      int cnt = (int) lm.getCount(tok);
      cntCounter.incrementCount(cnt);
    }

    final double[] coeffs = runLogSpaceRegression(cntCounter);

    UNK_PROB = cntCounter.getCount(1) / lm.totalCount();

    for (K tok : lm.keySet()) {
      double tokCnt = lm.getCount(tok);
      if (tokCnt <= unkCutoff) { // Treat as unknown
        unkTokens.add(tok);
      }
      if (tokCnt <= kCutoff) { // Smooth
        double cSmooth = katzEstimate(cntCounter, tokCnt, coeffs);
        lm.setCount(tok, cSmooth);
      }
    }

    // No explicit normalization step: Counters.normalize(lm) is disabled because,
    // per the original author's note, this Counter is always kept normalized.
  }

Example #3

File: UnigramLM.java Project: hltcoe/parma

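  // Fits log(N_c) = intercept + slope * c by least squares, where N_c is the number of
  // tokens seen exactly c times. SimpleRegression is presumably the Apache Commons Math class.
  private double[] runLogSpaceRegression(Counter<Integer> cntCounter) {
    SimpleRegression reg = new SimpleRegression();

    for (int cnt : cntCounter.keySet()) {
      reg.addData(cnt, Math.log(cntCounter.getCount(cnt)));
    }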

    // System.out.println(reg.getIntercept());
    // System.out.println(reg.getSlope());
    // System.out.println(reg.getSlopeStdErr());

    double[] coeffs = new double[] {reg.getIntercept(), reg.getSlope()};

    return coeffs;
  }
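
To make the regression step concrete without a library dependency, the sketch below fits the same kind of line, log(y) against x, with the closed-form ordinary least-squares formulas. The class name `LogSpaceFit` and its `fit` method are hypothetical names introduced here; the snippet is an illustration of the technique, not a copy of what `SimpleRegression` does internally.

```java
/** Hypothetical helper: ordinary least squares on (x, log y) pairs. */
class LogSpaceFit {

  /**
   * Returns {intercept, slope} of the least-squares line through the points
   * (x[i], log(y[i])). Assumes at least two distinct x values and y[i] > 0.
   */
  static double[] fit(double[] x, double[] y) {
    int n = x.length;
    double sumX = 0.0, sumZ = 0.0, sumXX = 0.0, sumXZ = 0.0;
    for (int i = 0; i < n; i++) {
      double z = Math.log(y[i]); // move the response into log space
      sumX += x[i];
      sumZ += z;
      sumXX += x[i] * x[i];
      sumXZ += x[i] * z;
    }
    double slope = (n * sumXZ - sumX * sumZ) / (n * sumXX - sumX * sumX);
    double intercept = (sumZ - slope * sumX) / n;
    return new double[] {intercept, slope};
  }

  public static void main(String[] args) {
    // Points generated from y = exp(2 - 0.5 * x), so the fit should recover
    // an intercept close to 2 and a slope close to -0.5.
    double[] x = {1.0, 2.0, 3.0};
    double[] y = {Math.exp(1.5), Math.exp(1.0), Math.exp(0.5)};
    double[] coeffs = fit(x, y);
    System.out.println("intercept=" + coeffs[0] + " slope=" + coeffs[1]);
  }
}
```

The returned array has the same layout as `coeffs` in the example above: index 0 is the intercept and index 1 is the slope.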

Example #4

File: UnigramLM.java Project: hltcoe/parma

  // Discounted count for a raw count c, computed from the counts-of-counts in cnt.
  private double katzEstimate(Counter<Integer> cnt, double c, double[] coeffs) {
    double nC = cnt.getCount((int) c);
    double nC1 = cnt.getCount(((int) c) + 1);
    // No token occurs exactly c + 1 times: fall back to the regression estimate.
    if (nC1 == 0.0) nC1 = Math.exp(coeffs[0] + (coeffs[1] * (c + 1.0)));

    double n1 = cnt.getCount(1);
    double nK1 = cnt.getCount(((int) kCutoff) + 1);
    // Same fallback for the count-of-counts just above the cutoff.
    if (nK1 == 0.0) nK1 = Math.exp(coeffs[0] + (coeffs[1] * (kCutoff + 1.0)));

    double kTerm = (kCutoff + 1.0) * (nK1 / n1);
    double cTerm = (c + 1.0) * (nC1 / nC);

    return (cTerm - (c * kTerm)) / (1.0 - kTerm);
  }
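
Written out as a formula, the value returned by `katzEstimate` above appears to be the following. The symbols are introduced here for illustration: N_j stands for `cnt.getCount(j)`, k for `kCutoff`, and a, b for `coeffs[0]`, `coeffs[1]`.

```latex
c^{*} \;=\;
\frac{(c+1)\,\dfrac{N_{c+1}}{N_{c}} \;-\; c\,\dfrac{(k+1)\,N_{k+1}}{N_{1}}}
     {1 \;-\; \dfrac{(k+1)\,N_{k+1}}{N_{1}}},
\qquad
N_{j} \approx e^{\,a + b\,j}
\ \text{ for } j \in \{c+1,\; k+1\} \text{ when the observed } N_{j} = 0 .
```

As a quick check with made-up values (not taken from the source), let c = 1, k = 5, N_1 = 100, N_2 = 40, N_6 = 5. Then kTerm = 6 * 5 / 100 = 0.3 and cTerm = 2 * 40 / 100 = 0.8, so c* = (0.8 - 1 * 0.3) / (1 - 0.3), which is roughly 0.714.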

Example #5

File: UnigramLM.java Project: hltcoe/parma

 public Set<K> getVocab() {
   return Collections.unmodifiableSet(lm.keySet());
 }

Example #6

File: UnigramLM.java Project: hltcoe/parma

 public boolean contains(K token) {
   return lm.containsKey(token);
 }

Example #7

File: UnigramLM.java Project: hltcoe/parma

 public int vocabSize() {
   return lm.keySet().size();
 }

Example #8

File: UnigramLM.java Project: hltcoe/parma

 public double totalMass() {
   return lm.totalCount();
 }

Example #9

File: UnigramLM.java Project: hltcoe/parma

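 public double getProb(K token) {
   // Unknown or below-cutoff tokens share the reserved unknown-token probability.
   if (unkTokens.contains(token) || !lm.containsKey(token)) return UNK_PROB;
   // The stored count is returned as-is, so lm is assumed to hold probabilities here.
   return lm.getCount(token);
 }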

Example #10

File: UnigramLM.java Project: hltcoe/parma

 public void incrementCount(K token) {
   lm.incrementCount(token);
 }