Natural Language Processing (NLP)

Natural language processing functions for OpenAI API-compatible providers.

smarter.apps.plugin.nlp.clean_prompt(prompt)

Clean up a prompt by inserting spaces before capital letters.

This function transforms concatenated or camel-cased words into a more readable format by adding spaces before capital letters, except for names starting with “Mc”. Useful for improving prompt clarity in NLP tasks.

Parameters:

prompt (str) – The input string to clean.

Returns:

The cleaned string with spaces before capital letters.

Return type:

str

Note

This is a simple heuristic and may not handle all edge cases. The one built-in exception is names starting with “Mc” (e.g., “McDaniel”), which are not split.
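
The heuristic described here can be approximated with a single regular expression. The following is a minimal sketch, not the library's actual implementation: the name clean_prompt_sketch is hypothetical, and the assumption is that a space is inserted before every capital letter unless it starts the string, already follows whitespace, or directly follows “Mc”.

```python
import re


def clean_prompt_sketch(prompt: str) -> str:
    """Hypothetical stand-in for clean_prompt(); the real code may differ."""
    # Zero-width match before each capital letter, skipped when the capital
    # is at the start of the string, follows whitespace, or follows "Mc".
    return re.sub(r"(?<!^)(?<!\s)(?<!Mc)(?=[A-Z])", " ", prompt)


print(clean_prompt_sketch("WhoIsLawrenceMcDaniel"))
# Who Is Lawrence McDaniel
```

Under this sketch, input that is already spaced is left unchanged, because the whitespace lookbehind prevents a second space from being inserted.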

Tip

Use this function to preprocess user input or model prompts for better readability.

Example usage:

from smarter.apps.plugin.nlp import clean_prompt

s = "WhoIsLawrenceMcDaniel"
print(clean_prompt(s))
# Output: "Who Is Lawrence McDaniel"

smarter.apps.plugin.nlp.does_refer_to(prompt, search_term, threshold=3)

Check if the prompt refers to the given string.

This function determines whether a prompt refers to a target string by first cleaning the prompt, then performing both direct and fuzzy matching. It uses simple_search() for exact or token-based matches, and within_levenshtein_distance() for typo-tolerant fuzzy matches.

Parameters:
  • prompt (str) – The input string to analyze.

  • search_term (str) – The target string to check for reference.

  • threshold (int) – The maximum Levenshtein distance for fuzzy matching (default: 3).

Returns:

True if the prompt refers to the search term, otherwise False.

Return type:

bool

Important

This function combines both exact and fuzzy matching for robust reference detection.
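
The control flow described above (clean the prompt, try the exact match, then fall back to the fuzzy match) might look roughly like the sketch below. It is an illustration only: refers_to_sketch and its clean, exact, and fuzzy parameters are hypothetical, the three stages are passed in as callables so the block stays self-contained, and the order in which the real function runs the two matchers is an assumption.

```python
from typing import Callable


def refers_to_sketch(
    prompt: str,
    search_term: str,
    clean: Callable[[str], str],
    exact: Callable[[str, str], bool],
    fuzzy: Callable[[str, str, int], bool],
    threshold: int = 3,
) -> bool:
    """Hypothetical composition: clean once, then exact match, then fuzzy match."""
    cleaned = clean(prompt)
    # Short-circuit: the fuzzy matcher only runs when the exact matcher fails.
    return exact(cleaned, search_term) or fuzzy(cleaned, search_term, threshold)


# Trivial stand-ins, used only to show the control flow.
result = refers_to_sketch(
    "  weather plugins  ",
    "weather",
    clean=str.strip,
    exact=lambda p, t: t.lower() in p.lower(),
    fuzzy=lambda p, t, n: False,
)
print(result)  # True
```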

Tip

Adjust the threshold parameter for stricter or looser fuzzy matching.

Example usage:

from smarter.apps.plugin.nlp import does_refer_to

# "Lawrance" is deliberately misspelled to exercise the fuzzy match.
prompt = "WhoIsLawranceMcDaniel"
print(does_refer_to(prompt, "Lawrence McDaniel"))  # True
print(does_refer_to(prompt, "John Doe"))           # False

smarter.apps.plugin.nlp.lower_case_splitter(string_of_words)

Split a string on spaces and return a list of lowercase words.

This function tokenizes a string by spaces and converts each token to lowercase. Useful for case-insensitive text processing, search, and normalization in NLP tasks.

Parameters:

string_of_words (str) – The input string to split and lowercase.

Returns:

List of lowercase words.

Return type:

list

Tip

Use this function to prepare text for matching, searching, or comparison.
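
A minimal sketch of this tokenization follows. The name lower_case_splitter_sketch is hypothetical, and the use of str.split() with no argument is an assumption; the real function may split on a literal space instead, which behaves differently when the input contains repeated spaces.

```python
from typing import List


def lower_case_splitter_sketch(string_of_words: str) -> List[str]:
    """Hypothetical stand-in for lower_case_splitter()."""
    # str.split() with no argument splits on any run of whitespace and
    # drops empty tokens, so doubled or trailing spaces are harmless.
    return [word.lower() for word in string_of_words.split()]


print(lower_case_splitter_sketch("The  Quick Brown Fox "))
# ['the', 'quick', 'brown', 'fox']

# For comparison, splitting on a literal space keeps empty tokens:
print("a  b".split(" "))
# ['a', '', 'b']
```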

Example usage:

from smarter.apps.plugin.nlp import lower_case_splitter

s = "The Quick Brown Fox"
print(lower_case_splitter(s))
# Output: ['the', 'quick', 'brown', 'fox']

smarter.apps.plugin.nlp.simple_search(prompt, search_term)

Check if the prompt contains the target string.

This function performs a case-insensitive search for the search_term within the prompt. It also checks if all tokens in the search term appear in the prompt, regardless of order.

Parameters:
  • prompt (str) – The input string to search within.

  • search_term (str) – The target string or phrase to look for.

Returns:

True if the search term is found in the prompt, otherwise False.

Return type:

bool

Tip

Use this function for simple keyword or phrase matching in user prompts or text analysis.

Caution

This function does not perform fuzzy matching or handle typos. For more advanced matching, consider using within_levenshtein_distance().
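
The two checks described above (a case-insensitive phrase match and an order-independent token match) can be sketched as follows. The name simple_search_sketch is hypothetical, and tokenizing on whitespace is an assumption about how the real function splits the search term.

```python
def simple_search_sketch(prompt: str, search_term: str) -> bool:
    """Hypothetical stand-in for simple_search()."""
    prompt_lower = prompt.lower()
    term_lower = search_term.lower()

    # 1) Case-insensitive phrase match.
    if term_lower in prompt_lower:
        return True

    # 2) Order-independent token match: every token of the search term
    #    must appear as a whole word in the prompt.
    prompt_tokens = set(prompt_lower.split())
    term_tokens = term_lower.split()
    return bool(term_tokens) and all(t in prompt_tokens for t in term_tokens)


prompt = "Find all weather plugins for New York"
print(simple_search_sketch(prompt, "plugins weather"))     # True
print(simple_search_sketch(prompt, "weather California"))  # False
```

The first call succeeds only through the token check, since the phrase "plugins weather" does not occur verbatim in the prompt.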

Example usage:

from smarter.apps.plugin.nlp import simple_search

prompt = "Find all weather plugins for New York"
print(simple_search(prompt, "weather plugins"))  # True
print(simple_search(prompt, "Weather"))          # True
print(simple_search(prompt, "California"))       # False

smarter.apps.plugin.nlp.within_levenshtein_distance(prompt, search_term, threshold=3)

Check if any title-cased word in the prompt is within the given Levenshtein distance of the target string.

This function compares each title-cased word in the prompt to the search_term using the Levenshtein distance metric. If any word is within the specified threshold, the function returns True. Useful for fuzzy matching and typo-tolerant search.

Parameters:
  • prompt (str) – The input string to search within.

  • search_term (str) – The target string to compare against.

  • threshold (int) – The maximum allowed Levenshtein distance for a match (default: 3).

Returns:

True if a word in the prompt is within the threshold distance of the search term, otherwise False.

Return type:

bool

Tip

Adjust the threshold parameter to control the strictness of fuzzy matching.

Caution

Only title-cased words in the prompt are considered for comparison.
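
To make the matching rule concrete, here is a self-contained sketch of a word-level Levenshtein check. The names levenshtein and within_distance_sketch are hypothetical, and two points are assumptions rather than documented behavior: that “title-cased” means str.istitle() is true for the word, and that the threshold is inclusive.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one previous row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def within_distance_sketch(prompt: str, search_term: str, threshold: int = 3) -> bool:
    """Hypothetical stand-in for within_levenshtein_distance()."""
    # Assumption: only words for which str.istitle() is true are compared.
    candidates = [word for word in prompt.split() if word.istitle()]
    return any(levenshtein(word, search_term) <= threshold for word in candidates)


print(levenshtein("Lawrance", "Lawrence"))  # 1
print(within_distance_sketch("find all plugins for lawrance", "Lawrence", 2))  # False
```

The second call returns False because the prompt contains no title-cased words, so nothing is compared. Note that under the str.istitle() assumption a word such as "McDaniel" is also skipped, since it has an uppercase letter in the middle.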

Example usage:

from smarter.apps.plugin.nlp import within_levenshtein_distance

# "Lawrance" is deliberately misspelled; it is one edit away from "Lawrence".
prompt = "Find all plugins for Lawrance McDaniel"
print(within_levenshtein_distance(prompt, "Lawrence", threshold=2))  # True