Java GroundedAction.getTransitions Examples

Programming language: Java

Namespace/package name: burlap.oomdp.singleagent

Class/type: GroundedAction

Method/function: getTransitions

Examples on hotexamples.com: 2

Java GroundedAction.getTransitions - 2 examples found. These are top-rated real-world Java examples of burlap.oomdp.singleagent.GroundedAction.getTransitions, extracted from open source projects.

Example #1

File: BoundedRTDP.java Project: RayMick/omscs-cs7641-machine-learning-assignment-4

  /**
   * Selects a next state for expansion when action a is applied in state s by randomly sampling
   * from the transition dynamics weighted by the margin of the lower and upper bound value
   * functions.
   *
   * @param s the source state of the transition
   * @param a the action applied in the source state
   * @return a {@link StateSelectionAndExpectedGap} object holding the next state to be expanded and
   *     the expected margin size of this transition.
   */
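  protected StateSelectionAndExpectedGap getNextStateBySampling(State s, GroundedAction a) {

    // Enumerate every possible outcome of applying a in s, with its probability.
    List<TransitionProbability> tps = a.getTransitions(s);
    double sum = 0.;
    double[] weightedGap = new double[tps.size()];
    HashableState[] hashedStates = new HashableState[tps.size()];
    for (int i = 0; i < tps.size(); i++) {
      TransitionProbability tp = tps.get(i);
      HashableState nsh = this.hashingFactory.hashState(tp.s);
      hashedStates[i] = nsh;
      double gap = this.getGap(nsh);
      // Weight each outcome by its probability times its upper/lower bound gap.
      weightedGap[i] = tp.p * gap;
      sum += weightedGap[i];
    }

    // Roulette-wheel selection over the weights normalized by their sum.
    double roll = RandomFactory.getMapped(0).nextDouble();
    double cumSum = 0.;
    for (int i = 0; i < weightedGap.length; i++) {
      cumSum += weightedGap[i] / sum;
      if (roll < cumSum) {
        return new StateSelectionAndExpectedGap(hashedStates[i], sum);
      }
    }

    // With a positive sum, the loop can only fall through when rounding leaves cumSum
    // slightly below 1; fall back to the last outcome instead of failing.
    if (sum > 0.) {
      return new StateSelectionAndExpectedGap(hashedStates[hashedStates.length - 1], sum);
    }

    // Otherwise there were no transitions, or the weighted gaps summed to 0, which makes
    // every normalized weight NaN so that no outcome can be selected.
    throw new RuntimeException(
        "Error: could not select a next state; no transitions or zero total weighted gap.");
  }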
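
The sampling step in Example #1 can be tried without BURLAP on the classpath. The following is a minimal sketch, not the project's code: the class `WeightedGapSampling`, the method `sampleIndex`, and the plain `double[]` inputs are hypothetical stand-ins for the transition probabilities (`tp.p`) and the per-state gaps (`getGap`). It scales the random roll by the total weight rather than dividing each weight, and it returns -1 instead of throwing when there is nothing to sample from.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-in for illustration only; not part of BURLAP.
class WeightedGapSampling {

  /**
   * Draws an index with probability proportional to probs[i] * gaps[i].
   * Returns -1 when the total weight is not positive (empty input or all gaps zero).
   */
  static int sampleIndex(double[] probs, double[] gaps, Random rng) {
    double[] weights = new double[probs.length];
    double sum = 0.0;
    for (int i = 0; i < probs.length; i++) {
      weights[i] = probs[i] * gaps[i];
      sum += weights[i];
    }
    if (!(sum > 0.0)) {
      return -1; // nothing to sample from
    }

    // Scale the roll by the total weight instead of normalizing every weight.
    double roll = rng.nextDouble() * sum;
    double cumSum = 0.0;
    int lastPositive = -1;
    for (int i = 0; i < weights.length; i++) {
      if (weights[i] <= 0.0) {
        continue; // outcomes with no weight can never be selected
      }
      cumSum += weights[i];
      lastPositive = i;
      if (roll < cumSum) {
        return i;
      }
    }
    // Rounding left cumSum just below sum; return the last selectable outcome.
    return lastPositive;
  }

  public static void main(String[] args) {
    double[] probs = {0.5, 0.3, 0.2};
    double[] gaps = {0.0, 2.0, 1.0}; // first outcome has already converged
    Random rng = new Random(42);
    int[] counts = new int[probs.length];
    for (int n = 0; n < 10000; n++) {
      counts[sampleIndex(probs, gaps, rng)]++;
    }
    System.out.println(Arrays.toString(counts));
  }
}
```

With these inputs the weights are 0, 0.6 and 0.2, so the first outcome is never drawn and the second is drawn roughly three times as often as the third.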

Example #2

File: BoundedRTDP.java Project: RayMick/omscs-cs7641-machine-learning-assignment-4

  /**
   * Selects a next state for expansion when action a is applied in state s according to the next
   * possible state that has the largest lower and upper bound margin. Ties are broken randomly.
   *
   * @param s the source state of the transition
   * @param a the action applied in the source state
   * @return a {@link StateSelectionAndExpectedGap} object holding the next state to be expanded and
   *     the expected margin size of this transition.
   */
  protected StateSelectionAndExpectedGap getNextStateByMaxMargin(State s, GroundedAction a) {

    // Enumerate every possible outcome of applying a in s, with its probability.
    List<TransitionProbability> tps = a.getTransitions(s);
    double sum = 0.; // expected gap over all outcomes: sum of p * gap
    double maxGap = Double.NEGATIVE_INFINITY;
    List<HashableState> maxStates = new ArrayList<HashableState>(tps.size());
    for (TransitionProbability tp : tps) {
      HashableState nsh = this.hashingFactory.hashState(tp.s);
      double gap = this.getGap(nsh);
      sum += tp.p * gap;
      if (gap == maxGap) {
        // Exact tie with the current best: keep this state as a candidate too.
        maxStates.add(nsh);
      } else if (gap > maxGap) {
        // Strictly larger gap: discard the earlier candidates.
        maxStates.clear();
        maxStates.add(nsh);
        maxGap = gap;
      }
    }

    // Break ties uniformly at random among the states that share the largest gap.
    int rint = RandomFactory.getMapped(0).nextInt(maxStates.size());
    return new StateSelectionAndExpectedGap(maxStates.get(rint), sum);
  }
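
The max-margin selection in Example #2 can also be sketched in isolation. This is an illustrative sketch under assumptions, not BURLAP code: the class `MaxGapSelection` and the methods `argMaxRandomTie` and `expectedGap` are hypothetical names, and plain arrays stand in for the list returned by `getTransitions` and the values returned by `getGap`. Unlike the example above, it rejects an empty input explicitly rather than letting `Random.nextInt(0)` fail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for illustration only; not part of BURLAP.
class MaxGapSelection {

  /** Returns the index of the largest gap, breaking exact ties uniformly at random. */
  static int argMaxRandomTie(double[] gaps, Random rng) {
    if (gaps.length == 0) {
      throw new IllegalArgumentException("gaps must not be empty");
    }
    double maxGap = Double.NEGATIVE_INFINITY;
    List<Integer> best = new ArrayList<>();
    for (int i = 0; i < gaps.length; i++) {
      if (gaps[i] > maxGap) {
        // Strictly larger gap: discard the earlier candidates.
        best.clear();
        best.add(i);
        maxGap = gaps[i];
      } else if (gaps[i] == maxGap) {
        // Exact tie: keep this index as a candidate too.
        best.add(i);
      }
    }
    return best.get(rng.nextInt(best.size()));
  }

  /** Expected gap over all outcomes: the sum of probs[i] * gaps[i]. */
  static double expectedGap(double[] probs, double[] gaps) {
    double sum = 0.0;
    for (int i = 0; i < probs.length; i++) {
      sum += probs[i] * gaps[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    double[] probs = {0.25, 0.25, 0.25, 0.25};
    double[] gaps = {1.0, 3.0, 3.0, 0.0}; // indices 1 and 2 tie for the largest gap
    Random rng = new Random(0);
    System.out.println("selected index: " + argMaxRandomTie(gaps, rng));
    System.out.println("expected gap: " + expectedGap(probs, gaps));
  }
}
```

The selection ignores the transition probabilities entirely; they only enter through the expected gap that is reported alongside the chosen state.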