Java Common.trimAll Examples

Programming language: Java

Namespace/package name: ir.ac.itrc.qqa.semantic.util

Class/type: Common

Method/function: trimAll

Examples at hotexamples.com: 1

Java Common.trimAll - 1 example found. It is a real-world Java example of ir.ac.itrc.qqa.semantic.util.Common.trimAll, extracted from an open source project.

Frequently used methods

log(2)

canonicalizeString(1)

logInline(1)

openFileForWriting(1)

printInline(1)

removeDiacritic(1)

removeParenthesis(1)

removeParenthesisWithException(1)

removePunctuations(1)

trimAll(1)

Example #1

File: Common.java Project: hasheminamin/HPR

  /**
   * A simple string normalizer
   *
   * @param text input text
   * @return normalized text
   */
  public static String normalizeNotTokenized(String text) {
    // TODO: some concepts contain '\r\n' and need it. Find a way to drop
    // 'replace("\r", " ").replace("\n", " ")'. Known issue if we do: the permanent concept ids file
    text =
        text.replace("ك", "ک")
            .replace("ي", "ی")
            .replace("ى", "ی")
            .replace("\r", " ")
            .replace("\n", " ");

    text =
        text.replace("ي", "ی")
            .replace("ی", "ی")
            .replace("ى", "ی")
            .replace("ك", "ک")
            .replace("ک", "ک");

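    // Replace U+200B (zero-width space) with U+200C (zero-width non-joiner).
    text =
        text.replaceAll(
            String.valueOf(Character.toChars(8203)), new String(Character.toChars(8204)));
    // Replace U+0649 (Arabic letter alef maksura, "ye maksura") with ye.
    text = text.replaceAll(String.valueOf(Character.toChars(1609)), "ی");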

    text = replaceCorresponding(text, "۰۱۲۳۴۵۶۷۸۹", "0123456789");
    text = replaceCorresponding(text, "٠١٢٣٤٥٦٧٨٩", "0123456789");

    // correcting punctuation spacings, commented as it contradicts the tokenizer's output
    // text = text.replaceAll(" ([;,،؛:])", "$1 ");
    // text = text.replaceAll("\\(", " \\(");
    // text = text.replaceAll("\\)", "\\) ");

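    // Single pass over pairs of spaces: runs of three or more are shortened, not fully collapsed.
    text = text.replace("  ", " ");
    // Characters passed to trimAll: double quote, space, and U+200C (zero-width non-joiner).
    text = Common.trimAll(text, "\" \u200C");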

    return text;
  }
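
The example calls Common.trimAll and replaceCorresponding but does not show either body. The sketch below is one plausible reading of them, not the project's actual code. It assumes trimAll strips every leading and trailing character found in its second argument, and that replaceCorresponding maps the i-th character of one string to the i-th character of the other. The class name NormalizerSketch and both method bodies are hypothetical.

```java
// Hypothetical stand-ins for the two helpers that normalizeNotTokenized calls.
// Neither body appears in the example above; the behaviour below is an assumption.
class NormalizerSketch {

  /** Assumed behaviour: strip every leading and trailing char that occurs in trimChars. */
  static String trimAll(String text, String trimChars) {
    int start = 0;
    int end = text.length();
    while (start < end && trimChars.indexOf(text.charAt(start)) >= 0) {
      start++;
    }
    while (end > start && trimChars.indexOf(text.charAt(end - 1)) >= 0) {
      end--;
    }
    return text.substring(start, end);
  }

  /** Assumed behaviour: replace the i-th char of 'from' with the i-th char of 'to'. */
  static String replaceCorresponding(String text, String from, String to) {
    if (from.length() != to.length()) {
      throw new IllegalArgumentException("from and to must have the same length");
    }
    StringBuilder out = new StringBuilder(text.length());
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      int idx = from.indexOf(c);
      out.append(idx >= 0 ? to.charAt(idx) : c);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Persian digits U+06F0..U+06F9, written as escapes to avoid source-encoding issues.
    String persianDigits = "\u06F0\u06F1\u06F2\u06F3\u06F4\u06F5\u06F6\u06F7\u06F8\u06F9";
    String digits = replaceCorresponding("\u06F1\u06F2\u06F3", persianDigits, "0123456789");
    // Same trim set as the example: double quote, space, zero-width non-joiner.
    String trimmed = trimAll("\" " + digits + " \u200C", "\" \u200C");
    System.out.println(trimmed); // prints 123
  }
}
```

Under these assumptions the last two steps of normalizeNotTokenized only affect the ends of the string: quotes, spaces, and zero-width non-joiners inside the text are left in place.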