Natural Language Processing (NLP)
Natural language processing functions for OpenAI API-compatible providers.
- smarter.apps.plugin.nlp.clean_prompt(prompt)[source]
Clean up a prompt by inserting spaces before capital letters.
This function transforms concatenated or camel-cased words into a more readable format by adding spaces before capital letters, except for names starting with “Mc”. Useful for improving prompt clarity in NLP tasks.
- Parameters:
prompt (str) – The input string to clean.
- Returns:
The cleaned string with spaces before capital letters.
- Return type:
str
Note
This is a simple heuristic and may not handle all edge cases perfectly. For example, names starting with “Mc” (e.g., “McDaniel”) are not split.
Tip
Use this function to preprocess user input or model prompts for better readability.
Example usage:
    from smarter.apps.plugin.nlp import clean_prompt

    s = "WhoIsLawrenceMcDaniel"
    print(clean_prompt(s))  # Output: "Who Is Lawrence McDaniel"
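The heuristic described above can be approximated with a single regular expression. The following is a minimal sketch, not the library's implementation: the name `clean_prompt_sketch` is hypothetical, and it assumes that a space is only wanted where a lowercase letter is directly followed by a capital letter.

```python
import re


def clean_prompt_sketch(prompt: str) -> str:
    """Hypothetical stand-in for clean_prompt(); not the library's code."""
    # (?<!Mc)     the position must not be directly preceded by "Mc"
    # (?<=[a-z])  the position must be directly preceded by a lowercase letter
    # (?=[A-Z])   the position must be directly followed by a capital letter
    # Assumption: runs of capitals such as "HTTPServer" are left untouched.
    return re.sub(r"(?<!Mc)(?<=[a-z])(?=[A-Z])", " ", prompt)


print(clean_prompt_sketch("WhoIsLawrenceMcDaniel"))  # Who Is Lawrence McDaniel
```

The lookbehind for "Mc" is what keeps "McDaniel" in one piece, which also shows why the approach is only a heuristic: any other name prefix with an internal capital would still be split.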
- smarter.apps.plugin.nlp.does_refer_to(prompt, search_term, threshold=3)[source]
Check if the prompt refers to the given string.
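This function determines whether a prompt refers to a target string by first cleaning the prompt, then performing both direct and fuzzy matching. It uses simple_search() for exact or token-based matches, and within_levenshtein_distance() for typo-tolerant fuzzy matches.
- Parameters:
prompt (str) – The prompt to check.
search_term (str) – The string to look for in the prompt.
threshold (int) – The maximum Levenshtein distance accepted as a fuzzy match. Defaults to 3.
- Returns:
True if the prompt refers to the search term, otherwise False.
- Return type:
bool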
Important
This function combines both exact and fuzzy matching for robust reference detection.
Tip
Adjust the threshold parameter for stricter or looser fuzzy matching.
Example usage:
    from smarter.apps.plugin.nlp import does_refer_to

    prompt = "WhoIsLawranceMcDaniel"
    print(does_refer_to(prompt, "Lawrence McDaniel"))  # True
    print(does_refer_to(prompt, "John Doe"))  # False
- smarter.apps.plugin.nlp.lower_case_splitter(string_of_words)[source]
Split a string on spaces and return a list of lowercase words.
This function tokenizes a string by spaces and converts each token to lowercase. Useful for case-insensitive text processing, search, and normalization in NLP tasks.
- Parameters:
string_of_words (str) – The input string to split and lowercase.
- Returns:
List of lowercase words.
- Return type:
list[str]
Tip
Use this function to prepare text for matching, searching, or comparison.
Example usage:
    from smarter.apps.plugin.nlp import lower_case_splitter

    s = "The Quick Brown Fox"
    print(lower_case_splitter(s))  # Output: ['the', 'quick', 'brown', 'fox']
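For readers who want to see the behavior without installing the package, the tokenization can be sketched in one line of standard Python. The name `lower_case_splitter_sketch` is hypothetical. The sketch also assumes that consecutive spaces should not produce empty tokens, which the source does not specify.

```python
from typing import List


def lower_case_splitter_sketch(string_of_words: str) -> List[str]:
    """Hypothetical stand-in for lower_case_splitter(); not the library's code."""
    # str.split() with no argument splits on any run of whitespace and
    # never yields empty strings (assumption: that is the desired behavior).
    return [word.lower() for word in string_of_words.split()]


print(lower_case_splitter_sketch("The Quick Brown Fox"))  # ['the', 'quick', 'brown', 'fox']
```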
- smarter.apps.plugin.nlp.simple_search(prompt, search_term)[source]
Check if the prompt contains the target string.
This function performs a case-insensitive search for the search_term within the prompt. It also checks if all tokens in the search term appear in the prompt, regardless of order.
- Parameters:
prompt (str) – The text to search.
search_term (str) – The string to search for.
- Returns:
True if the search term is found in the prompt, otherwise False.
- Return type:
bool
Tip
Use this function for simple keyword or phrase matching in user prompts or text analysis.
Caution
This function does not perform fuzzy matching or handle typos. For more advanced matching, consider using within_levenshtein_distance().
Example usage:
    from smarter.apps.plugin.nlp import simple_search

    prompt = "Find all weather plugins for New York"
    print(simple_search(prompt, "weather plugins"))  # True
    print(simple_search(prompt, "Weather"))  # True
    print(simple_search(prompt, "California"))  # False
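The two checks described for simple_search(), a case-insensitive substring test followed by an order-independent token test, might look roughly like the sketch below. This is an illustration under stated assumptions and not the library's code: the name `simple_search_sketch` is hypothetical, and tokens are assumed to be whitespace-separated words compared after lowercasing.

```python
def simple_search_sketch(prompt: str, search_term: str) -> bool:
    """Hypothetical stand-in for simple_search(); not the library's code."""
    prompt_lower = prompt.lower()
    term_lower = search_term.lower()

    # Check 1: the whole search term appears as a substring, ignoring case.
    if term_lower in prompt_lower:
        return True

    # Check 2: every token of the search term appears among the prompt's
    # tokens, in any order (assumption: tokens are whitespace-separated).
    prompt_tokens = set(prompt_lower.split())
    term_tokens = term_lower.split()
    return bool(term_tokens) and all(t in prompt_tokens for t in term_tokens)


prompt = "Find all weather plugins for New York"
print(simple_search_sketch(prompt, "weather plugins"))  # True
print(simple_search_sketch(prompt, "plugins weather"))  # True
print(simple_search_sketch(prompt, "California"))  # False
```

The second call only succeeds through the token check, since "plugins weather" is not a substring of the prompt.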
- smarter.apps.plugin.nlp.within_levenshtein_distance(prompt, search_term, threshold=3)[source]
Check if any word in the prompt is within the given Levenshtein distance of the target string.
This function compares each title-cased word in the prompt to the search_term using the Levenshtein distance metric. If any word is within the specified threshold, the function returns True. Useful for fuzzy matching and typo-tolerant search.
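- Parameters:
prompt (str) – The prompt whose words are compared.
search_term (str) – The string to compare each word against.
threshold (int) – The maximum Levenshtein distance accepted as a match. Defaults to 3.
- Returns:
True if a word in the prompt is within the threshold distance of the search term, otherwise False.
- Return type:
bool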
Tip
Adjust the threshold parameter to control the strictness of fuzzy matching.
Caution
Only title-cased words in the prompt are considered for comparison.
Example usage:
    from smarter.apps.plugin.nlp import within_levenshtein_distance

    prompt = "Find all plugins for Lawrance McDaniel"
    print(within_levenshtein_distance(prompt, "Lawrence", threshold=2))  # True
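To make the threshold concrete, the sketch below computes the Levenshtein distance with the standard dynamic-programming recurrence and applies it word by word. It is not the library's implementation: the names `levenshtein` and `within_distance_sketch` are hypothetical, and "title-cased" is assumed to mean "starts with an uppercase letter" (Python's own `str.istitle()` would reject "McDaniel").

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


def within_distance_sketch(prompt: str, search_term: str, threshold: int = 3) -> bool:
    """Hypothetical stand-in for within_levenshtein_distance()."""
    # Assumption: a "title-cased" word is one whose first character is uppercase.
    words = [w for w in prompt.split() if w[:1].isupper()]
    return any(levenshtein(w, search_term) <= threshold for w in words)


print(levenshtein("Lawrance", "Lawrence"))  # 1
print(within_distance_sketch("Find all plugins for Lawrance McDaniel", "Lawrence", threshold=2))  # True
```

A threshold close to the length of the words being compared matches almost anything, so smaller values are the safer choice for short search terms.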