specless.automaton.mps.BMPS_exact

specless.automaton.mps.BMPS_exact(symbols: List[int], M: ndarray, S: ndarray, F: ndarray, d: int, empty_symbol: int, min_string_prob: float, max_string_length: int, num_strings_to_find: int = 1, depth_first: bool = False, add_entropy: bool = True, disable_pbar: bool = False) Tuple[Iterable[Hashable], float, List[Tuple[float, Tuple[Iterable[Hashable], Iterable[float]]]]][source]

Finds the bounded, most probable string(s) (MPS) in a stochastically weighted finite automaton (SWFA).

Automaton MUST have edge weights as transition probabilities, but the outgoing transition weights don’t necessarily need to add up to 1, i.e. the transition matrices don’t need to be formal Stochastic Matrices.

This is useful if the automaton is a product, and thus its MPS can be projected onto its constituent automaton and have the same probability in its constituent automaton.

Will default to return only the MOST probable, viable string thus far, but this algorithm generalizes to return the num_strings_to_find viable strings, decreasingly sorted by their probabilities.

Originally written as BMPS_exact in: “The most probable string: an algorithmic study” by de la Higuera et. al

Parameters:
  • symbols – All symbol indices in the symbol map

  • M – a (d x d x num_symbols) tensor containing the probabilistically weighted (NOT NECESSARILY stochastic) transition matrices representing the automaton, keyed on the third index - i.e. by symbol

  • S – a (1 x d) vector containing the initial state probabilities

  • F – a (d x 1) vector containing the final state probabilities

  • d – the number of states in the automaton

  • empty_symbol – The “empty” symbol

  • min_string_prob – The minimum string probability

  • max_string_length – The maximum string length

  • num_strings_to_find – The number of viable strings to return. Defaults to only return the ONE, highest probability string encountered thus far in the search, which means the algorithm is the original BMPS_exact. If >1, then the algorithm returns the num_strings_to_find most probable, viable strings from the search heap.

  • depth_first – Whether to explore the automaton using a depth-first search pattern. Using a depth-first search pattern will be faster for very deep, tree-shaped automaton, but will not return the absolute best symbol sequence for the given min_string_prob and max_string_length. Only turn on if you have a terminal states deep in the automaton and you need the search to be faster.

  • add_entropy – Only keeps a new viable string if it has a previously unseen probability of being generated

  • disable_pbar – Disable pbar for speeding up the computation speed.

Returns:

(most probable word in the SWFA, it’s probability, num_strings_to_find viable strings in a max heap container)