Downloading Fuzzy Search for Splunk
  • SHA256 checksum (fuzzy-search-for-splunk_2010.tgz) 3bb0463daf29dcb08fc76000a67aa8a86e5988d76ec777a32b12854c7e5ca94b
  • SHA256 checksum (fuzzy-search-for-splunk_209.tgz) f23d4b7e1634541ad139170f604663184c8ec787a9c1e060825c1a95b98286da
  • SHA256 checksum (fuzzy-search-for-splunk_207.tgz) c76b40dd29f097bbd4605de654a2c298465314b6d25a75e8d4340e7dcc940283
  • SHA256 checksum (fuzzy-search-for-splunk_206.tgz) e0d41567b6d5e451067d3c37e1b81b866d475fe5bdd39523d81cec51d6b1e296
  • SHA256 checksum (fuzzy-search-for-splunk_205.tgz) 34c7e44aff637793b6b84326de3fa3159e89378dbf9c8591ddd8c3970af8829b
  • SHA256 checksum (fuzzy-search-for-splunk_204.tgz) c51277e69573b8b979ac5c137699db4b1b95a1c3736572f80ff900e873378829
  • SHA256 checksum (fuzzy-search-for-splunk_203.tgz) 0d19db6610102de8cdbd9dacbf7cac3c578a938f91a2095489fdc3a78374d8e4
  • SHA256 checksum (fuzzy-search-for-splunk_202.tgz) 0a181c09e69b682c13cd9383dc249702f5fe667c344c8ff054efa066bd1f4a7c
  • SHA256 checksum (fuzzy-search-for-splunk_201.tgz) c05298fad8ec18abfbe79e25007c7e9006b7d8b7cac7381aa1306070beb12ce4
  • SHA256 checksum (fuzzy-search-for-splunk_20.tgz) e30de3b900b4668cdfa5cf5f68dff70529c0926b663ecd3246addc61725d8956
  • SHA256 checksum (fuzzy-search-for-splunk_122.tgz) 8a0e88419b72d8f49642cc060a90a9f9389fd2373a826982813293167baf829f
  • SHA256 checksum (fuzzy-search-for-splunk_121.tgz) 3d33bfc926db894d27aaec902ae750d9524cd9c6a7c8f95a83b252412bcc7022
  • SHA256 checksum (fuzzy-search-for-splunk_12.tgz) 5e042a33c96ebe98ea592d9eb00ca465e74817c4082ea2cd70fe2ffc2bdf04d1
  • SHA256 checksum (fuzzy-search-for-splunk_11.tgz) 40d88f2f35c01637c6357bdafce5f2536cfa71026689415f06491baff4eb67f7
  • SHA256 checksum (fuzzy-search-for-splunk_10.tgz) 333c1343930f1e2350c601e7c11e620df28fee0a414cf225c03e2ac093cd9a11

Fuzzy Search for Splunk

This app has been archived. Learn more about app archiving.
This app is NOT supported by Splunk. Please read about what that means for you here.
The goal of this app is to provide simple fuzzy matching to find lookalikes in a given dataset. For example, you could leverage this to search against process execution logs to find fuzzy matches for "svchost.exe" to highlight executions of "svch0st.exe" or "scvhost.exe".

Change Log

1.0

  • Initial Release

1.1

  • Minor changes to try to increase performance of the script
  • Verified app continued to function with Splunk 6.5

1.2

  • Added searchbnf.conf
  • Added minor error checking in case a user provides a bad delim regex

1.2.1

  • No one told me there was an error in the script and I guess I didn't test it fully. Stupid typo. :(

1.2.2

  • I changed the default behavior of the script. If you don't want to specify a delimiter, it will no longer try to split the input. If a bad delimiter is given, it will default to a newline.

2.0

  • Migrated from intersplunk to the Splunk SDK for Python.
  • Updated fuzzywuzzy library to latest release
  • Updated readme file to markdown syntax
  • Verified compatibility with Splunk 7.0
  • Set option local = True to force the command to only run on the search head
  • Made a number of workflow improvements, trying to increase command performance.
  • Now only bothers to track the maximum ratio matched instead of also tracking the minimum.

2.0.1

  • Bug fixes to support multivalue input fields again

2.0.2

  • Documentation updates based on appinspect output.

2.0.3

  • Added appicon images for compatibility with certification.

2.0.4

  • Added user requested feature to supply a wordlist from a field in a given event
  • Confirmed compatibility with Splunk 7.1

2.0.5

  • Removed configuration to force command to run locally to support distributed streaming
  • Tested compatibility with Splunk 7.2

2.0.6

  • Updated fuzzywuzzy library to 0.17
  • Minor code update for future py3 compat
  • Tested compatibility with Splunk 7.3

2.0.7

  • Tested compatibility with Splunk 8.0 and py3

2.0.8

  • Attempted to make library import more dynamic to fix a possible issue with distributed searching.

2.0.9

  • Updated splunk-sdk to 1.6.14
  • Updated fuzzywuzzy to 0.18.0
  • Tested compatibility with Splunk 8.1

Prerequisites

This search command is packaged with the following external libraries:
+ Splunk SDK for Python version 1.6.6 (http://dev.splunk.com/python)
+ FuzzyWuzzy 0.17.0 (https://github.com/seatgeek/fuzzywuzzy)

Nothing further is required for this add-on to function.

Installation

Follow standard Splunk installation procedures to install this app.

Reference: https://docs.splunk.com/Documentation/AddOns/released/Overview/Singleserverinstall
Reference: https://docs.splunk.com/Documentation/AddOns/released/Overview/Distributedinstall

Usage

Using a static wordlist provided as input

| fuzzy wordlist="svchost.exe" type="simple" compare_field="tester" output_prefix="fuzz" delims="(\\\\)"
  • Wordlist is a comma-separated list of words you want to check for fuzzy matches.
  • Type is the type of matching. Acceptable values are simple, partial, token_sort, and token_set; reference the library documentation for what each one does.
  • Compare_field is the field you want to do your fuzzy matching in. It defaults to _raw.
  • Output_prefix defaults to 'fuzzywuzzy_'.
  • Delims accepts a regex string, escaped Splunk style, and defaults to (\\\\|/|\s+|;|-)
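
To make the delims option more concrete, here is a minimal Python sketch of how a delimiter regex like the default could split a field value into pieces. It is not taken from the app's source. The function name split_field is hypothetical, and the pattern assumes that the Splunk-escaped "\\\\" ends up as a regex that matches one literal backslash.

```python
import re

# Default delimiter pattern from the docs, written as a Python raw string.
# Assumption: Splunk-style "\\\\" unescapes to a regex matching one backslash.
DEFAULT_DELIMS = r"(\\|/|\s+|;|-)"

def split_field(value, delims=DEFAULT_DELIMS):
    """Split a field value on the delimiter regex (hypothetical helper)."""
    pattern = re.compile(delims)
    # re.split() also returns text captured by groups in the pattern,
    # so drop the delimiters themselves along with any empty strings.
    return [p for p in pattern.split(value) if p and not pattern.fullmatch(p)]

print(split_field(r"C:\Windows\System32\svch0st.exe"))
# ['C:', 'Windows', 'System32', 'svch0st.exe']
```

Each piece produced this way would then be compared against every word in the wordlist, which is why a narrow delimiter pattern means less work.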

Using a field based wordlist (Version 2.0.4 and later)

| fuzzy wordlist=Creator_Process_Name compare_field=New_Process_Name
  • Wordlist is a field that exists in each event containing a comma-separated list of words
  • All other options are the same

Sample Use Cases / Searches

Look for process names similar to svchost.exe

eventtype=win_process_new New_Process_Name=* | fuzzy wordlist="svchost.exe" compare_field="New_Process_Name"

Search proxy logs for domains similar to your company's

eventtype=proxy_logs domain=* | fuzzy wordlist="companydomain1.com,companydomain2.com,companydomain3.com" compare_field="domain"

Performance Considerations

There is a nested loop of death: the provided wordlist is split, the given input is split, and the two sets of pieces are compared against each other. You can improve your performance in the following ways:

  • Keep your wordlist to a minimum
  • Keep the regex splitting delimiters to a minimum
  • Try to filter data before passing it to this command (i.e. don't pass in useless junk)

I use this command in production and will continue to work on improvements, but considering the looping that is done, it may always have performance issues.
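
As a rough illustration of why the looping gets expensive, the sketch below estimates the number of fuzzy comparisons for a search. It assumes one comparison per wordlist entry per piece of the split input, which is a simplification and not a measurement of the app. The function and parameter names are hypothetical.

```python
def comparisons_per_search(wordlist, pieces_per_event, event_count):
    """Estimate comparisons as words x pieces per event x events."""
    # The wordlist is a comma-separated string, as passed to the command.
    words = [w for w in wordlist.split(",") if w.strip()]
    return len(words) * pieces_per_event * event_count

# Three domains, four pieces per event, one million events
print(comparisons_per_search(
    "companydomain1.com,companydomain2.com,companydomain3.com", 4, 1_000_000
))  # 12000000
```

Under that assumption, cutting any one of the three factors (fewer words, fewer delimiters, fewer events) reduces the total proportionally.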

How it works (basically):

  1. The wordlist is separated.
  2. The comparison field is separated by the delimiter string provided.
  3. The two are compared.
  4. The following values are added to the event output:
       • prefix_max_match_word
       • prefix_max_match_ratio

The ratio will contain a value from 0 to 100, where 100 is a perfect match. The word value will contain what actually matched in the input/wordlist combination.
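
The steps above can be sketched in plain Python. This is an illustration only, not the app's code: it uses difflib from the standard library as a stand-in for FuzzyWuzzy's scoring, the names simple_ratio and fuzzy_max_match are hypothetical, and it assumes the reported word is the piece of the input that scored highest.

```python
import re
from difflib import SequenceMatcher

def simple_ratio(a, b):
    """Similarity from 0 to 100 (difflib stand-in for a 'simple' ratio)."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))

def fuzzy_max_match(wordlist, value, delims=None, prefix="fuzzywuzzy_"):
    """Return the best-scoring piece of value and its ratio."""
    # 1. Separate the comma-separated wordlist.
    words = [w.strip() for w in wordlist.split(",") if w.strip()]
    # 2. Separate the comparison field; with no delimiter, compare it whole.
    pieces = [p for p in re.split(delims, value) if p] if delims else [value]
    # 3. Compare every word with every piece, tracking only the maximum.
    best_word, best_ratio = "", 0
    for word in words:
        for piece in pieces:
            score = simple_ratio(word, piece)
            if score > best_ratio:
                best_word, best_ratio = piece, score
    # 4. Add the prefixed values to the event output.
    return {
        prefix + "max_match_word": best_word,
        prefix + "max_match_ratio": best_ratio,
    }

result = fuzzy_max_match(
    "svchost.exe", r"C:\Windows\System32\svch0st.exe", delims=r"\\"
)
print(result)
```

In this sketch a lookalike such as "svch0st.exe" scores high without reaching 100, so filtering on a ratio range below 100 is one way to surface lookalikes while skipping exact matches.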

Support

If support is required or you would like to contribute to this project, please reference: https://gitlab.com/johnfromthefuture/TA-fuzzy. This app is supported by the developer as time allows.

Release Notes

Version 2.0.10
March 1, 2022

2.0.10

  • Updated splunk SDK
  • Tested compatibility with Splunk 8.2
Version 2.0.9
Nov. 13, 2020

2.0.8

  • Attempted to make library import more dynamic to fix a possible issue with distributed searching.

2.0.9

  • Updated splunk-sdk to 1.6.14
  • Updated fuzzywuzzy to 0.18.0
  • Tested compatibility with Splunk 8.1
Version 2.0.7
Jan. 17, 2020

2.0.7
Tested compatibility with Splunk 8 / py3.

Version 2.0.6
Aug. 1, 2019

2.0.6

  • Updated fuzzywuzzy library to 0.17
  • Minor code update for future py3 compat
  • Tested compatibility with Splunk 7.3
Version 2.0.5
Oct. 22, 2018

2.0.5

  • Removed configuration to force command to run locally to support distributed streaming
  • Tested compatibility with Splunk 7.2
Version 2.0.4
May 16, 2018

2.0.4

  • Added user requested feature to supply a wordlist from a field in a given event
  • Confirmed compatibility with Splunk 7.1
Version 2.0.3
March 10, 2018
  • Added appicon images for compatibility with certification.
Version 2.0.2
March 9, 2018
  • Documentation updates based on appinspect output.
Version 2.0.1
March 9, 2018
  • Bug fixes to support multivalue input fields again
Version 2.0
March 9, 2018
  • Migrated from intersplunk to the Splunk SDK for Python.
  • Updated fuzzywuzzy library to latest release
  • Updated readme file to markdown syntax
  • Verified compatibility with Splunk 7.0
  • Set option local = True to force the command to only run on the search head
  • Made a number of workflow improvements, trying to increase command performance.
Version 1.2.2
Nov. 12, 2016

Minor script modifications changing the regex splitting assumptions. If you now choose not to specify a "delimiter" to split up the input field, the script will no longer default to splitting that field. I did this for performance reasons allowing for the possibility to preprocess data before passing it to this script.

Version 1.2.1
Nov. 11, 2016

Version 1.2.1
I put in a bad try/except block typing try/else instead... Fixed.

Version 1.2
Oct. 2, 2016
  • Adds minor error checking for a bad regex
  • Adds searchbnf.conf for documentation and highlighting
Version 1.1
Oct. 2, 2016
  • Minor changes to try to increase performance of the script
  • Verified app continued to function with Splunk 6.5
Version 1.0
March 30, 2016

Version 1.0: Custom search command implementation of FuzzyWuzzy libraries. Reference: https://github.com/seatgeek/fuzzywuzzy

