This application accompanies the Splunk conf 2017 presentation "How did you get so big? Tips and tricks for growing your Splunk installation from 50GB/day to 1TB/day"
The overall idea behind this application is to provide a variety of alerts that detect issues, or potential issues, within the Splunk log files and then advise via an alert that this has occurred.
This application was built because there were a variety of messages in the Splunk console and logs in Splunk that, if acted upon, could have prevented an issue within the environment.
The original presentation is available as a recording or PDF
The powerpoint should it be required is available here
There are many potential alerts that might cause an issue, so this application has all alerts disabled by default. Post-installation, once the required macros are configured, you can enable the alerts you wish to use and add the required actions.
There are also dashboards for investigating indexer performance, heavy forwarder queue usage, data model acceleration issues among other items that may be of interest to a Splunk admin
Please note that all alerts & dashboards were tested on Linux-based Splunk infrastructure, with AIX, Linux and Windows forwarders.
If you are running your Splunk enterprise installation on Windows or have customised your installation directory, you will need to customise some of the macros, such as splunkadmins_splunkd_source, to point to the correct splunkd log file location.
Also note that this application contains a very large number of alerts. You may wish to utilise the allow_skew setting in savedsearches.conf to allow the scheduler to balance out the execution times of the scheduled alerts.
Finally, the application has evolved over the years. More recent releases have resulted in very generic alerts such as AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only, which is designed as a "catch all" to cover many splunkd log messages. The older alerts are very specific, as the team I worked in was new to Splunk and required a more specific outcome/action based on each alert.
Feel free to use either. Feedback or contributions via github or email are always welcome.
The various saved searches and dashboards use macros within their searches. You will need to update the macros to ensure the searches/dashboards work as expected.
To check the contents of the macros in Splunk 7 or newer, use CTRL-SHIFT-E within the search window.
The macros are listed below. Many expect a host=A OR host=B item to assist in narrowing down a search, while others expect only a single value. Note that splunk_server values are always lower-case and case-sensitive!
indexerhosts - a host=... list of your indexers (for example host=indexer1 OR host=indexer2)
heavyforwarderhosts - a host=... list of your heavy forwarders (for example host=heavyforwarder1 OR host=heavyforwarder2)
searchheadhosts - a host=... list of your search head(s) (for example host=searchhead1 OR host=searchhead2)
localsearchheadhosts - a host=... list of your search head(s) within the cluster that these alerts are running on
splunkenterprisehosts - a host=... list of any Splunk enterprise instance (for example host=indexer1 OR host=searchhead1 OR ...)
deploymentserverhosts - a host=... list of deployment server(s) (for example host=splunkdeploymentserver)
licensemasterhost - a host=... entry for the license master server (for example host=splunklicensemaster)
searchheadsplunkservers - a splunk_server=... list of any Splunk search head hosts (for example splunk_server=searchhead*)
splunkindexerhostsvalue - a splunk_server=... list of any Splunk indexer hosts (for example splunk_server=indexer*)
splunkadmins_splunkd_source - this defaults to source=*splunkd.log, for a slight improvement in performance you can make this a specific file such as /opt/splunk/var/log/splunk/splunkd.log
splunkadmins_splunkuf_source - this defaults to source=*splunkd.log, you may wish to narrow down this location if your splunkd logs on universal forwarders have consistent installation directories
splunkadmins_mongo_source - this defaults to source=*mongod.log, for a slight improvement in performance you can make this a specific file such as /opt/splunk/var/log/splunk/mongod.log
splunkadmins_clustermaster_oshost - a host=... entry for the cluster master server (for example host=splunkclustermaster)
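As a rough illustration of the host=A OR host=B format that many of these macros expect, the following Python sketch builds a macro definition from a list of host names. The helper names (host_filter, macro_stanza) and the example hosts are hypothetical and are not part of the app; the printed stanzas would still need to be copied into the app's local/macros.conf (or entered via the GUI) by hand.

```python
def host_filter(hosts, field="host"):
    """Build a 'field=A OR field=B' string for use as a macro definition."""
    # splunk_server values are described above as lower-case and
    # case-sensitive, so they are lower-cased here; host values are left as-is.
    if field == "splunk_server":
        hosts = [h.lower() for h in hosts]
    return " OR ".join(f"{field}={h}" for h in hosts)


def macro_stanza(name, definition):
    """Format a macros.conf stanza for a macro that takes no arguments."""
    return f"[{name}]\ndefinition = {definition}\n"


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(macro_stanza("indexerhosts", host_filter(["indexer1", "indexer2"])))
    print(macro_stanza("searchheadsplunkservers",
                       host_filter(["SearchHead1"], field="splunk_server")))
```

Generating the definitions from one host list may help keep the host= and splunk_server= macros consistent with each other, but this is only one possible approach.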
The macros are used in various alerts which you can optionally enable. The alerts only raise a triggered alert, as email actions are not allowed for Splunk app certification purposes.
The macros are also used in the dashboards for this application
The vast majority of the alerts also have one or more macros which you can customise to tweak the search results. For example, the macro splunkadmins_weekly_truncated allows the alert IndexerLevel - Weekly Truncated Logs Report to be customised without changing the alert itself. This will make upgrading to a new version of this app more straightforward.
I have attempted to provide a macro in any alert where I deemed it appropriate. Feedback is welcome for any alert that you believe should have a macro or requires further improvement.
The application is designed to work on a search head or search head cluster instance; installation on the indexing tier is not required. You may wish to use your monitoring console server as the search head to run this app on (as it will have splunk_server_groups configured for your environment).
There are a few searches that use REST API calls which are specific to the search head cluster they run on. These alerts will have to be placed on each search head or search head cluster; alternatively, any server with the required search peers will also work. The relevant alerts are:
- SearchHeadLevel - Accelerated DataModels with All Time Searching Enabled
- SearchHeadLevel - Realtime Scheduled Searches are in use
- SearchHeadLevel - Realtime Search Queries in dashboards
- SearchHeadLevel - Scheduled Searches without a configured earliest and latest time
- SearchHeadLevel - Scheduled searches not specifying an index
- SearchHeadLevel - Scheduled searches not specifying an index macro version
- SearchHeadLevel - Scheduled Searches Configured with incorrect sharing
- SearchHeadLevel - Saved Searches with privileged owners and excessive write perms
- SearchHeadLevel - User - Dashboards searching all indexes
- SearchHeadLevel - User - Dashboards searching all indexes macro version
- SearchHeadLevel - Users exceeding the disk quota (the recent jobs list uses a REST call so you may need to adjust the search). SearchHeadLevel - Users exceeding the disk quota introspection is an alternative that is not search head specific
The following reports are also specific to a search head or search head cluster:
- SearchHeadLevel - Alerts that have not fired an action in X days
- SearchHeadLevel - Data Model Acceleration Completion Status
- SearchHeadLevel - Macro report
- What Access Do I Have?
The following dashboards are search head or search head cluster specific:
- Data Model Rebuild Monitor
- Data Model Status
The following reports/alerts must either run on the cluster master or on a server where the cluster master is a peer:
- ClusterMasterLevel - Per index status
- ClusterMasterLevel - Primary bucket count per peer
Once the application is installed, all alerts are disabled by default and you can enable those you require or want to test in your local environment.
If you choose not to customise the macros then many searches will search for all hosts, which will make the alerts and dashboards inaccurate!
The alerts are all useful for detecting a variety of different scenarios which may or may not be applicable within your Splunk environment
The description field has an (extremely) simple way of determining if an alert will require action, there are three levels:
- Low - the alert is informational and likely relates to a potential issue, these alerts may produce false alarms
- Moderate - the alert is a warning, most likely further action will need to be taken, a moderate chance of false alarms
- High - the alert is likely relating to something that requires action and there is a very low chance that this will create false alarms
I do not have a nice way to auto-enable various alerts other than editing local/savedsearches.conf or using the GUI. Any contribution of a setup file would be welcome here!
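Until a setup file exists, one option is to generate the local/savedsearches.conf overrides with a small script. The Python sketch below is a minimal example and not part of the app: it assumes an alert is enabled by setting disabled = 0 in its stanza, and it adds an allow_skew value so the scheduler can spread out execution times. The function name, the chosen alerts and the 10% skew are illustrative assumptions only.

```python
import configparser
import io


def build_local_savedsearches(alert_names, allow_skew="10%"):
    """Return savedsearches.conf text that enables the named alerts."""
    # interpolation=None so that a value such as "10%" is stored literally.
    conf = configparser.ConfigParser(interpolation=None)
    conf.optionxform = str  # keep option names exactly as written
    for name in alert_names:
        # Assumption: disabled = 0 in local/ overrides the app default.
        conf[name] = {"disabled": "0", "allow_skew": allow_skew}
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Alert names taken from this document; pick the ones you want to test.
    print(build_local_savedsearches([
        "IndexerLevel - Weekly Truncated Logs Report",
        "AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only",
    ]))
```

The output can be reviewed and then saved as local/savedsearches.conf within the app. Alert actions still need to be added separately.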
In the current environment the vast majority of the alerts are enabled to detect issues, they raise automated tickets or email depending on the urgency of the specific alert.
There are a few environment characteristics that may require changes to the way the app is used, and feedback is welcome if there is a nicer way to structure the alerts/application
The overall assumption is that the admin(s) are not carefully watching the splunkd logs or the messages in the console of the monitoring server/Splunk servers
Before 2019 the universal forwarders in use were installed on a mix of Windows, Linux & AIX servers. In 2019 and beyond the testing scope has been vastly reduced to focus primarily on Splunk enterprise servers.
All heavy forwarders and Splunk enterprise installations are Linux based. While I expect the alerts will work in a Windows based environment with changes only to macros.conf, this remains untested.
The test environment for this application has a single indexer cluster and two search head clusters
Inspired by articles such as "Things I wish I knew then", and by knowledge collected from various conference replays, SplunkAnswers, 200+ support tickets & nearly four years of working on a Splunk environment, I decided to share what I have learned in an attempt to prevent others from repeating the same mistakes.
There are many Splunk conf talks available on this subject in various conference replays, however my goal was to provide practical steps to implement the ideas. That is why this application exists
Are all well suited to an automated email using the sendresults command or a similar function as they involve end user configuration which the individual can change/fix
This application was first created in 2017 and both Splunk and the application have evolved during this time period. This application is a library of potential alerts that could be used in a Splunk environment so it would never be a good idea to turn on all alerts from this application.
The list of alerts and reports below has been actively used since version 8.0.x, in 8.2.x and eventually 9.0:
- Due to the 20,000 character limit on SplunkBase this is on github
Some CSV lookups are now replaced with kvstore entries due to the ability to sync the kvstore across multiple search head or search head cluster(s) via apps like KV Store Tools Redux
There are a number of reports with the keyword "platform_stats" in the title. These were designed to run mcollect commands and to collect data into a metric index.
The metrics then contain detailed information about the number of users using Splunk per search head cluster, data indexed at the indexing tier, resource usage per user, et cetera.
There is plenty of detail in here, but dashboards for the collected metrics were not included. Contributions are welcome.
As of version 8.0.8 there is still no accurate way to detect which indexes were searched by a user based on their level of access; the audit logs simply do not record which indexes were accessed.
Therefore the following searches are part of this app to help achieve this goal:
- SearchHeadLevel - Search Queries summary exact match
- SearchHeadLevel - Search Queries summary non-exact match
As per the descriptions of the searches, they both require other reports such as SearchHeadLevel - Macro report. The description of each search details the various reports it relies on.
However, these complicated searches are not 100% accurate, so alternative searches exist in this app to work at the indexing tier:
- IndexerLevel - RemoteSearches Indexes Stats
- IndexerLevel - RemoteSearches Indexes Stats Wildcard
The remote_searches.log at the indexing tier does not (usually) need macro substitution to be performed, however you do not have information about the user that ran the searches. This search is therefore more likely to overcount index access than the search tier version, but it is also less likely to miss an index due to macro usage or similar.
In more detail, the challenges with the search head level's audit.log searches are:
- You cannot determine which index was used if multiple indexes were specified. For example, if a search such as index=A OR index=B returns more than 0 results, you cannot be sure which index returned the results, so both are recorded by searches in this app
- Macros, eventtypes, tags and datamodels are recorded unexpanded in the audit.log, so you need to substitute the macro/eventtype/tag to correctly determine if an index is in use. To make this more complicated, macros can be nested, so a macro may refer to another macro and the 2nd or 3rd macro may contain the index= information
- There are many ways to search an index, such as index= "" or index IN (...). The regexes attempt to deal with the various straightforward scenarios such as NOT index=A index=B, but it is not straightforward to correctly extract index names from the audit.log in all scenarios
- The audit.log information for ad-hoc searches does not record app context. Therefore, even if you know the macro and user information, you cannot be sure which app the search was run from, and so you cannot correctly substitute the macro/tag/eventtype information
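To make the first three challenges more concrete, here is a heavily simplified Python sketch of the general idea: expand nested macros, then collect every index name mentioned in the search. It is not the logic used by the searches in this app (those are SPL with more involved regexes); the function names and the example macros are hypothetical, macros with arguments are left unexpanded, and NOT clauses and subsearches are ignored.

```python
import re

# Any backtick-delimited macro call, e.g. `web_index`.
_MACRO = re.compile(r"`([^`]+)`")
# index=name, index="name" or index::name.
_INDEX_EQ = re.compile(r'\bindex\s*(?:=|::)\s*"?([\w*.\-]+)"?', re.IGNORECASE)
# index IN (a, b, "c")
_INDEX_IN = re.compile(r"\bindex\s+IN\s*\(([^)]*)\)", re.IGNORECASE)


def expand_macros(search, macros, max_depth=5):
    """Substitute known macros repeatedly so nested macros are resolved."""
    for _ in range(max_depth):
        # Unknown macros (including macros with arguments) are left as-is.
        expanded = _MACRO.sub(
            lambda m: macros.get(m.group(1).strip(), m.group(0)), search)
        if expanded == search:
            break
        search = expanded
    return search


def candidate_indexes(search):
    """Return every index name mentioned; all are recorded as candidates."""
    found = set(_INDEX_EQ.findall(search))
    for group in _INDEX_IN.findall(search):
        for item in group.split(","):
            name = item.strip().strip('"')
            if name:
                found.add(name)
    return found


if __name__ == "__main__":
    # Hypothetical macro definitions; the second level holds the index.
    macros = {"web_data": "`web_index` sourcetype=access",
              "web_index": "index=web"}
    print(candidate_indexes(expand_macros("`web_data` | stats count", macros)))
    print(candidate_indexes("index=A OR index=B"))
```

Note that the sketch needs the macro definitions as input. In practice those depend on the app context, which is exactly the information that is missing for ad-hoc searches.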
At the indexing tier the remote_searches.log file has different challenges:
- While macros, eventtypes and tags are expanded (in most cases, there are bugs that allow macros to reach the indexing tier), you instead lose the user context in cases such as ad-hoc searches. This means that a search like index=* run by a user with permission to access only 1 index will appear to these searches to be accessing all indexes. The current implementation of the RemoteSearches queries in this app assumes access to all indexes if the username is unknown (which may result in excess matching rather than missing searches)
- App context is again missing for ad-hoc searches, although this is less important at the indexing tier
- You cannot determine which index was used if multiple indexes were specified. For example, if a search such as index=A OR index=B returns more than 0 results, you cannot be sure which index returned the results, so both are recorded
The following ideas relate to this issue:
- Better audit logs
- Provide index access statistics to assist in capacity planning of the indexing tier
Feel free to provide feedback via SplunkBase and contributions are welcome!
This project is open source and hosted on github SplunkAdmins
TA-Alerts for SplunkAdmins or github
I have also created VersionControl for Splunk, VersionControl for Splunk Cloud, decrypt2
Refer to README.md or github
Icons made by Freepik from www.flaticon.com is licensed by Creative Commons BY 3.0
New reports:
- IndexerLevel - events per second benchmark
- IndexerLevel - savedsearches by indexer execution time
- SearchHeadLevel - indexes per savedsearch
- SearchHeadLevel - macros in use
- SearchHeadLevel - Indexes for savedsearch without subsearches
- SearchHeadLevel - platform_stats.remote_searches metrics populating search 24 hour
Macro updates to use splunkadmins_clustermaster_host instead of splunk_server=local on cluster manager related savedsearches
Macro updates to use splunkadmins_restmacro instead of splunk_server=local on related savedsearches
A number of report and alert updates, full information in the README.md or GitHub release
New alerts:
- MonitoringConsole - one or more servers require configuration
- MonitoringConsole - one or more servers require configuration automated
- SearchHeadLevel - Peer timeouts or authentication issues
New macros:
- splunkadmins_macro_sub
New reports:
- SearchHeadLevel - Datamodel REST endpoint indexes in use
- SearchHeadLevel - Job performance data per indexer
- SearchHeadLevel - Jobs endpoint example
- SearchHeadLevel - configtracker index example
Various updated alerts, macros, reports and a dashboard update.
Added supported themes settings in app.conf to allow the usage of dark theme (for 9.1 enterprise users and above)
Updated alerts:
- AllSplunkEnterpriseLevel - ulimit on Splunk enterprise servers is below 8192 - missing parenthesis, thanks Gregg Woodcock
- IndexerLevel - replicationdatareceiverthread close to 100% utilisation - incorrect macro
- MonitoringConsole - Crash logs have appeared on the filesystem - incorrect macro, github issue #22, thanks SANSd20
Added lookup file:
- splunkadmins_indexlist_by_cluster.csv
SearchHeadLevel - audit.log - lookup usage - correcting issue #21 (thanks @barrettnet)
From 3.0.9:
Updated alerts:
- SplunkEnterpriseLevel - Splunkd Log Messages Admins Only - more criteria
- SearchHeadLevel - Scheduled Searches That Cannot Run - correcting issue #20 (thanks @barrettnet)
New reports:
- SearchHeadLevel - Detect lookups that have not being accessed for a period of time
- SearchHeadLevel - Lookup Editor lookup updates
- SearchHeadLevel - Lookups within dashboards
- SearchHeadLevel - Lookups within savedsearches
- SearchHeadLevel - REST API usage via audit.log
All remaining updates are in the README.md or github
In version 3.0.8 the lookup file splunkadmins_hec_reply_code_lookup.csv was updated based on gettingsmarter (github repo). The updated lookup was created by @jgedeon and additionally includes some health endpoint return codes (as well as those returned by the standard HEC endpoint)
Updated reports are listed in the README.md or github
New alerts:
- SearchHeadLevel - summary indexing searches not using durable search
New macros:
- indexer_cluster_name without any parameters created as per issue #19 (barrettnet)
New reports:
- SearchHeadLevel - audit.log - lookup usage
- SearchHeadLevel - license usage per sourcetype per index
- SearchHeadLevel - Lookup file owners
- IndexerLevel - RemoteSearches - lookup usage
Updated alerts:
- AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - more matching criteria
- SearchHeadLevel - Scheduled Searches That Cannot Run - as per issue #18 (AHCL1)
- SearchHeadLevel - SHC Captain unable to establish common bundle - additional exclusion for Splunk 9.0.x
Updated reports:
- IndexerLevel - platform_stats.indexers totalgb measurement - added * to the end of license_usage.log, updated indexer_cluster_name with parameter as per issue #19 (barrettnet)
- IndexerLevel - platform_stats.indexers totalgb_thruput measurement - updated indexer_cluster_name with parameter as per i
New macros:
- sysloghosts
New reports:
- SearchHeadLevel - Knowledge Bundle contents
- syslog-ng - cache statistics summary - as contributed by Marc Andersen, company: NIL815 ApS
Updated dashboards:
- splunk_forwarder_output_tuning - added fillnull for ingest_pipe
Updated various alerts
Updated dashboards:
- Splunk forwarder output tuning - added fillnull ingest_pipe
Updated reports/alerts:
- SearchHeadLevel - Dashboards using special characters - updated to use spath command instead of rex
- SearchHeadLevel - Search Messages user level - excluded require command
- IndexerLevel - RemoteSearches find all time searches - removed keyword
On reports/alerts:
- IndexerLevel - RemoteSearches Indexes Stats
- IndexerLevel - RemoteSearches Indexes Stats Wilcard
- IndexerLevel - Slow peer from remote searches
- IndexerLevel - SmartStore cache misses - remote_searches
- SearchHeadLevel - platform_stats.remote_searches metrics populating search
Updated keywords to terminated: or closed: (previously terminated)
On various reports/alerts removed regex:
| rex "(?s)^(?:[^'\n]*'){4},\s+\w+='(?P<search>[\s\S]+)'\]($|\[[^\]]+\]$)"
As it is causing issues with max_matches, newer Splunk versions appear to accurately match the search field without this regex
New alerts:
- IndexerLevel - Connection errors to SmartStore
New reports:
- SearchHeadLevel - Sourcetypes usage from search telemetry data
Updated alerts:
- AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - more matching criteria
- ForwarderLevel - Data dropping duration - comment update
- SearchHeadLevel - Search Queries summary exact match - regex updates and 1 regex removal
- SearchHeadLevel - Search Queries summary non-exact match - regex updates and 1 regex removal
Updated macro:
- splunkadmins_metrics_source - corrected to include source=
Removed app.manifest file
New alerts:
- IndexerLevel - Buckets have being frozen due to index sizing SmartStore
Updated alerts:
- AllSplunkEnterpriseLevel - Replication Failures - comment update
- AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - additional criteria and removed SHC restart times
- IndexerLevel - Buckets have being frozen due to index sizing - comment update only
- IndexerLevel - IndexConfig Warnings from Splunk indexers - additional criteria
- SearchHeadLevel - Script failures in the last day
- SearchHeadLevel - KVStore Or Conf Replication Issues Are Occurring
- SearchHeadLevel - SavedSearches using special characters
- SearchHeadLevel - Search Messages user level - removed some messages from the alert
Merged pull request from jeffland-consist via github including various changes
New alerts:
- IndexerLevel - replicationdatareceiverthread close to 100% utilisation
New macros:
- splunkadmins_metrics_source
- splunkadmins_hec_metrics_source
New reports:
- SearchHeadLevel - Accelerated DataModels Access Info
- SearchHeadLevel - Dashboards resulting in concurrency issues
- SearchHeadLevel - Dashboards that may benefit from base or post-process searches
- SearchHeadLevel - Searches by search type
Updated macros:
- splunkadmins_splunkd_source
- splunkadmins_splunkuf_source
- splunkadmins_mongo_source
- splunkadmins_license_usage_source
To include a trailing wildcard (so splunkd.log.1 matches or similar)
Various alerts/reports updated to use splunkadmins_splunkd_source and/or splunkadmins_metrics_source macros
New macros:
- splunkadmins_shutdown_time_by_period
New alerts:
- MonitoringConsole - Check OS ulimits via REST
- SearchHeadLevel - Detect bundle pushes no longer occurring
New reports:
- DeploymentServer - Count by application - contributed by @trex (radler)
- IndexerLevel - DataModel Acceleration - Indexes in use
- SearchHeadLevel - Knowledge bundle status on indexers
- SearchHeadLevel - Knowledge bundle replication times metrics.log
Updated alerts:
- AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only
Updated dashboards:
- splunk_introspection_io_stats - updated names/description of fields used
- indexer_max_data_queue_sizes_by_name - minor tweak to replication queue queries
- indexer_max_data_queue_sizes_by_name_v8 - minor tweak to replication queue queries
- splunk_forwarder_output_tuning - comment update only
Updated macros:
- splunkadmins_shutdown_time_by_period(4) to work as expected
Added link to Admins Little Helper for Splunk and TrackMe
README.md improvements
Due to the creation of TA-Alerts for SplunkAdmins, the following are removed in this release:
- bin directory
- README directory
- default/searchbnf.conf
- default/inputs.conf
- default/commands.conf
LookupWatcher and the custom commands streamfilter and streamfilterwildcard are now moved into the new TA-Alerts for SplunkAdmins application
New alerts:
- AllSplunkEnterpriseLevel - error in stdout.log
- IndexerLevel - platform_stats.indexers stddev incoming measurement
- MonitoringConsole - Core dumps have appeared on the filesystem
- MonitoringConsole - Crash logs have appeared on the filesystem
- SearchHeadLevel - Splunk Scheduler logs have not appeared in the last
Updated 12 savedsearches as per README.md
Updated python SDK to 1.6.20
Updates to reports/alerts:
IndexerLevel - Future Dated Events that appeared in the last week - comment update
IndexerLevel - IndexConfig Warnings from Splunk indexers - added wildcard to improve matching
Updated regex to handle index:: case on various alerts/reports
Updated links in nav menu:
SideView UI (user activity)
2.6.12 fixes a missing \ character in savedsearches.conf introduced in 2.6.11
Changed:
splunkadmins_userlist_indexinfo from kvstore collection into a csv file to prevent unnecessary restarts related to updating this app (on standalone instances this triggers a restart due to collections.conf), collections.conf was removed from this app
New dashboard:
splunk_introspection_io_stats - just an I/O focussed dashboard based on introspection data
New macros:
splunkadmins_shutdown_time_by_shc
cluster_masters
Updated alerts:
AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - more criteria
IndexerLevel - IndexConfig Warnings from Splunk indexers - updated criteria
SearchHeadLevel - KVStore Or Conf Replication Issues Are Occurring - updated keywords for new instances, added more criteria to reduce false alarms
SearchHeadLevel - Lookup updates within SHC
Updated dashboards:
heavyforwarders_max_data_queue_sizes_by_name_v8
indexer_max_data_queue_sizes_by_name
smartstore_stats
splunk_forwarder_output_tuning - added attribution link
New alert: SearchHeadLevel - Excessive REST API usage
New dashboard: splunk_forwarder_data_balance_tuning - new dashboard based on Brett Adam's work
New macro: diskusage
Various dashboards and alerts updated, full details on github https://github.com/gjanders/SplunkAdmins
Updated alerts:
AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - removed 1 log entry for consecutive date entries/unretrievable data
ForwarderLevel - Splunk HEC issues - added cluster command
New dashboards:
ForwarderLevel - Splunk HEC issues
New reports:
IndexerLevel - SmartStore cache misses - remote_searches
IndexerLevel - Buckets in cache
SearchHeadLevel - Detect searches hitting corrupt buckets
SearchHeadLevel - SmartStore cache misses - savedsearches
SearchHeadLevel - SmartStore cache misses - dashboards
SearchHeadLevel - SmartStore cache misses - combined
Updated SDK to 1.6.18
TERM command removed from various savedsearches as it added no benefit (full details in README.md or github)
New alerts:
AllSplunkLevel - No recent metrics.log data
New dashboards:
heavyforwarders_max_data_queue_sizes_by_name_v8 - this version uses tstats with PREFIX so only works with Splunk 8.0+
indexer_max_data_queue_sizes_by_name_v8 - this version uses tstats with PREFIX so only works with Splunk 8.0+
splunk_forwarder_output_tuning - using metrics.log to measure the TCP output/stdev per-name, includes example tuning parameters
New reports:
IndexerLevel - platform_stats.indexers stddev measurement - stdev per indexer cluster (useful for tuning the outputs.conf from incoming servers)
IndexerLevel - platform_stats.indexers totalgb_thruput measurement - index thruput measurements
Various reports, alerts and macro updates, full details on https://github.com/gjanders/SplunkAdmins/releases/tag/2.6.8
Note that this app is incorrectly flagged as not python3 compatible. As per https://github.com/gjanders/SplunkAdmins/issues/14 this is a false alarm and I'm testing on python3. If you do see issues, use the contact developer option or github to let me know!
New alerts:
IndexerLevel - SmartStore - Bucket cache errors audit logs
SearchHeadLevel - Accelerated DataModels with wildcard or no index specified
New reports:
IndexerLevel - IndexWriter pause duration
IndexerLevel - RemoteSearches find all time searches
IndexerLevel - RemoteSearches find datamodel acceleration with wildcards
SearchHeadLevel - platform_stats.audit metrics users 24hour
SearchHeadLevel - platform_stats.users dashboards
SearchHeadLevel - platform_stats.users savedsearches
Updated alerts:
Various - refer to details page
Updated to Splunk python SDK 1.1.16
Merged from jordanfelle to remove non-ASCII character
Updated alerts:
SearchHeadLevel - dispatch metadata files may need removal
SearchHeadLevel - Dashboards with all time searches set
To remove non-ASCII characters
New reports:
IndexerLevel - RemoteSearches Indexes Stats Wilcard - example wildcard match for remote_searches.log
SearchHeadLevel - Index list by cluster report - for a list of indexes by indexer cluster
Updated reports:
IndexerLevel - RemoteSearches Indexes Stats - added additional info around bucket cache usage, improved accuracy, provided mcollect example
IndexerLevel - Slow peer from remote searches - added more search types into the list
SearchHeadLevel - Search Queries summary exact match - improved accuracy for append/join/multisearch/set
SearchHeadLevel - Search Queries summary non-exact match - improved accuracy for append/join/multisearch/set
Updated alerts:
AllSplunkEnterpriseLevel - Splunk Servers with resource starvation - as per github issue #12, thanks RahimAbdulla
SearchHeadLevel - Detect MongoDB errors - fix the alert by re-adding the fillnull into the subsearch
Various other updates as per README.md
Updated alerts:
AllSplunkLevel - Splunk forwarders that are not talking to the deployment server - contribution via email (Vincent)
AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - a few new additions
SearchHeadLevel - datamodel errors in splunkd - excluded kvstore shutdown
SearchHeadLevel - Search Messages admins only - new exclusions
Updated dashboard:
issues_per_sourcetype - the Invalid parsed time panel needed another regex - contribution via email (Vincent)
Updated reports:
SearchHeadLevel - Search Queries summary exact match - minor updates, added cache stats, improved accuracy
SearchHeadLevel - Search Queries summary non-exact match - minor updates, added cache stats, improved accuracy
Renamed/replaced reports:
SearchHeadLevel - Search Queries summary exact match 73 - new name is SearchHeadLevel - Search Queries summary exact match
SearchHeadLevel - Search Queries summary non-exact match 73 - new name is SearchHeadLevel - Search Queries summary non-exact match
New alert:
SearchHeadLevel - authorize.conf settings will prevent some users from appearing in the UI
Updated alerts:
AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only - a few more errors
SearchHeadLevel - Search Messages user level - updated comment, added sid field
SearchHeadLevel - Search Messages admins only - added sid field
SearchHeadLevel - Detect MongoDB errors - added partial flag to remove false alarms (thanks afx)
IndexerLevel - Timestamp parsing issues combined alert - update to provide a list of hosts per sourcetype
Updated dashboards:
detect_excessive_search_use - removing ldap query section (as this is env specific)
issues_per_sourcetype - wording update on title
knowledge_objects_by_app - corrected drilldown link to point to the SplunkAdmins app (thanks Vincent!)
Updated Splunk python SDK to 1.6.15
Please note 2.6.0's release notes are here https://github.com/gjanders/SplunkAdmins (major release)
2.6.2: Re-release to pass automated app inspect (identical to 2.6.1)
2.6.1:
2 nav menu items fixed (incorrect alert names) by pull request from EsOsO
New alerts:
SearchHeadLevel - Splunk alert actions exceeding the max_action_results limit - detect if any alert action exceeds the limit and receives limited results, currently a silent failure as per https://ideas.splunk.com/ideas/EID-I-781
Updated alerts:
AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only
IndexerLevel - Search Failures
SearchHeadLevel - Detect MongoDB errors - added missing | symbol as per email update from afx
SearchHeadLevel - Search Messages user level
SearchHeadLevel - Search Messages admins only
SearchHeadLevel - SHC Captain unable to establish common bundle
SearchHeadLevel - Splunk alert actions exceeding the max_action_results limit
Various README.md updates
New alerts:
AllSplunkEnterpriseLevel - Splunkd Log Messages Admins Only (generic alert)
DeploymentServer - Error Found On Deployment Server
SearchHeadLevel - Dashboards invalid character in splunkd
SearchHeadLevel - savedsearches invalid character in splunkd
SearchHeadLevel - datamodel errors in splunkd
SearchHeadLevel - Search Messages user level - this alert searches the Splunk search messages and looks for errors that should be actionable by an end user (generic alert)
SearchHeadLevel - Search Messages admins only - this alert searches the Splunk search messages but is designed to find errors that cannot be fixed by end users (generic alert)
Updated alerts:
Various, refer to the README.md file for more details
Renamed alert:
IndexerLevel - Splunk Indexers Losing Contact With Master - new name is AllSplunkEnterpriseLevel - Losing Contact With Master Node
Removed alert:
IndexerLevel - Unable to replicate thawed directories in a cluster
Update Splunk python SDK to 1.6.14
New alerts:
IndexerLevel - Slow peer from remote searches
Updated dashboard:
hec_performance as per pull request from jordanfelle
2.5.13 is identical to 2.5.12 and includes an extra lookup file to pass appinspect
New alerts:
SearchHeadLevel - splunk_search_messages dispatch
SearchHeadLevel - WLM aborted searches
SearchHeadLevel - dispatch metadata files may need removal
Minor changes to reports:
SearchHeadLevel - Search Queries summary exact match 73
SearchHeadLevel - Search Queries summary non-exact match 73
And macro:
splunkadmins_audit_logs_datamodel_sub
Updated alert:
SearchHeadLevel - Dashboards with all time searches set - updated to look for earliest= in tokens and to ignore that case
Updated reports:
SearchHeadLevel - Indexer Peer Connection Failures
SearchHeadLevel - Detect searches hitting corrupt buckets
The above were updated to use the splunk_search_messages sourcetype
IndexerLevel - Knowledge bundle upload stats - updated to handle cascading bundle replication
Added notes around the log_search_messages property under [search] in limits.conf
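For orientation, a minimal sketch of what the log_search_messages setting mentioned above might look like in limits.conf follows. The values are illustrative assumptions rather than settings shipped or recommended by this app, and the search_messages_severity line assumes your Splunk version supports that setting; check the limits.conf spec for your release before using either.

```
# limits.conf on the search head(s) - illustrative values only
[search]
# Assumption: enabling this makes splunkd log user-facing search messages,
# which the search-message alerts in this app can then read
log_search_messages = true
# Assumption: minimum severity of search messages to log; WARN is used here as an example
search_messages_severity = WARN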
New macros:
conf_rest_endpoint
splunkadmins_epoch
splunkadmins_audit_logs_datamodel_sub
splunkadmins_audit_logs_eventtypes_sub
splunkadmins_audit_logs_macro_sub_v8 - note this version uses mvmap, so it requires Splunk v8; splunkadmins_audit_logs_macro_sub still exists for pre-version 8 but can only replace 1 macro per run...
splunkadmins_audit_logs_tags_sub
New reports:
SearchHeadLevel - DataModels report
SearchHeadLevel - Tags report
SearchHeadLevel - EventTypes report
Updated dashboard troubleshooting_resource_usage_per_user_drilldown to display the correct time range for more searches
Updated reports:
IndexerLevel - RemoteSearches Indexes Stats - to summarize indexes stats
SearchHeadLevel - Scheduled searches not specifying an index macro version
SearchHeadLevel - User - Dashboards searching all indexes macro version
SearchHeadLevel - Search Queries By Type Audit Logs macro version
SearchHeadLevel - Search Queries By Type Audit Lo
Updated to Splunk python SDK 1.6.13 (previous 2.5.9 did not include this update)
New alerts:
AllSplunkLevel - TailReader Ignoring Path
ForwarderLevel - Channel churn issues
SearchHeadLevel - Dashboards with all time searches set
New reports:
SearchHeadLevel - audit logs showing all time searches
Updated reports:
SearchHeadLevel - Macro report - to use the new macro
SearchHeadLevel - Search Queries summary exact match 73 - to use the new macro
SearchHeadLevel - Search Queries summary non-exact match 73 - to use the new macro
New macros:
splunkadmins_splunk_server_name
New alerts:
AllSplunkLevel - Unexpected termination of a Splunk process windows
AllSplunkLevel - Unexpected termination of a Splunk process unix
IndexerLevel - strings_metadata triggering bucket rolling
New reports:
ForwarderLevel - Data dropping duration
SearchHeadLevel - Lookup CSV size
New dashboards:
lookup_audit
New macro:
mylookups (7.3.3+ only)
New nav menu items:
Hyperlink to https://github.com/silkyrich/cluster_health_tools
Updated to Splunk python SDK 1.6.12
Set python.version = python3 within inputs.conf.spec as per appinspect requirement
New alerts:
ClusterMasterLevel - excess buckets on master
Updated alerts:
ForwarderLevel - Splunk HEC issues - corrected criteria for newer Splunk versions and added more matching
SearchHeadLevel - SHC Captain unable to establish common bundle - to remove special character from comment
Renamed alert:
IndexerLevel - Buckets are been frozen due to index sizing - new name is IndexerLevel - Buckets have being frozen due to index sizing (as requested by woodcock)
New reports:
SearchHeadLevel - Dashboards using special characters
SearchHeadLevel - SavedSearches using special characters
Moved lib directory to bin/lib (as this does not distribute to the indexers otherwise, sent feedback on https://dev.splunk.com/enterprise/docs/python/sdk-python/howtousesplunkpython/howtocreatemodpy/ so this gets updated)
New macro:
base64decode - this macro requires decrypt or a similar app to be useful, but the searches utilising it will work fine without it
New reports:
SearchHeadLevel - platform_stats.audit metrics searches
SearchHeadLevel - platform_stats.audit metrics users
SearchHeadLevel - platform_stats.audit metrics api
The above 3 replace SearchHeadLevel - platform_stats.audit metrics, which is now removed.
New reports continued:
IndexerLevel - RemoteSearches Indexes Stats
SearchHeadLevel - DataModel Fields
SearchHeadLevel - Dashboard refresh intervals
SearchHeadLevel - Dashboards using depends and running searches in the background
SearchHeadLevel - Summary searches using realtime search scheduling
SearchHeadLevel - Searches dispatched as owner by other users
Various report updates...
Further updates to the new reports from 2.5.5 relating to platform stats, improved accuracy with identifying dashboard usage vs ad-hoc searches
Lookup Watcher now imports six from lib directory (allows this to work on older Splunk versions)
Minor update to props.conf for splunk:search:info as in 7.3 auto-finalized messages are now INFO level
Updated SearchHeadLevel - platform_stats access summary to include searches triggered (which are often coming from dashboard usage)
New report:
SearchHeadLevel - platform_stats.remote_searches metrics populating search
Updated reports:
IndexerLevel - platform_stats.counters hosts
IndexerLevel - platform_stats.counters hosts 24hour
IndexerLevel - platform_stats.indexers totalgb measurement
SearchHeadLevel - SHC conf log summary
SearchHeadLevel - platform_stats.audit metrics
SearchHeadLevel - platform_stats.user_stats.introspection metrics populating search
SearchHeadLevel - platform_stats access summary
New macro:
search_type_from_sid
Various new summary reports that record platform level metrics/stats
New alert:
SearchHeadLevel - SHC Captain unable to establish common bundle
New reports:
IndexerLevel - platform_stats.counters hosts
IndexerLevel - platform_stats.counters hosts 24hour
IndexerLevel - platform_stats.indexers totalgb measurement
SearchHeadLevel - SHC conf log summary
SearchHeadLevel - platform_stats.audit metrics
SearchHeadLevel - platform_stats.user_stats.introspection metrics populating search
SearchHeadLevel - platform_stats access summary
Updated dashboard:
indexer_max_data_queue_sizes_by_name
New macro:
search_head_cluster
Identical to 2.5.3, re-release due to SplunkBase issue
Changes for python3 compatibility
Updated python SDK to 1.6.11 (from 1.6.6)
Lookup files are now included (zero sized)
New macros:
splunkadmins_audit_logs_macro_sub
splunkadmins_remote_macros (this macro requires Webtools Add-on)
splunkadmins_remote_roles (this macro requires Webtools Add-on)
New reports:
SearchHeadLevel - Search Queries summary exact match 73
SearchHeadLevel - Search Queries summary exact match 73 by user (uses Search Queries summary exact match 73 as base)
SearchHeadLevel - Search Queries summary exact match 73 by index (uses Search Queries summary exact match 73 as base)
SearchHeadLevel - Search Queries summary non-exact match 73
Updated alerts:
IndexerLevel - Time format has changed multiple log types in one sourcetype
IndexerLevel - Timestamp parsing issues combined alert
Updated dashboard:
issues_per_sourcetype
Full details in the README.md or details tab
New modular input - Lookup Watcher - details in the README.md file
Introduced a new sub-menu in the navigation menu for Search Head Level "Recommended (externally hosted)" with links to external dashboards
Updated reports:
SearchHeadLevel - Search Queries By Type Audit Logs
SearchHeadLevel - Search Queries By Type Audit Logs macro version
SearchHeadLevel - Search Queries By Type Audit Logs macro version other
To reduce the number of unknown queries
Updated reports:
SearchHeadLevel - Search Queries summary exact match
SearchHeadLevel - Search Queries summary non-exact match
To improve the statistics around indexes found
Please note that if you are upgrading from a version pre 2.5.0 the bin/splunklib directory can be deleted from this app ($SPLUNK_HOME/etc/apps/SplunkAdmins/bin/splunklib is no longer required)
If you like this app you may also be interested in VersionControl For Splunk https://splunkbase.splunk.com/app/4355/
Updated alert - SearchHeadLevel - Scheduled Searches That Cannot Run to find more results
Updated dashboard issues per sourcetype to handle message becoming event_message in newer Splunk versions (7.1 or 7.2)
Updated macros splunkadmins_shutdown_list, splunkadmins_shutdown_keyword, splunkadmins_shutdown_time, splunkadmins_transfer_captain_times to handle message becoming event_message in newer Splunk versions (7.1 or 7.2)
Updated python files streamfilter/streamfilterwildcard to import lib relative to the current app name
Updated many alerts/reports to handle the message field becoming event_message in newer Splunk versions (7.1 or 7.2), full list in the details tab or the README.md file
New macro - splunkadmins_shutdown_keyword
New report - IndexerLevel - Knowledge bundle upload stats
Updated alert - AllSplunkEnterpriseLevel - Replication Failures with new criteria and excluded shutdowns
Updated alert - AllSplunkEnterpriseLevel - Splunk Scheduler skipped searches and the reason to handle another skipped scenario
Updated alert - AllSplunkEnterpriseLevel - Splunk Servers with resource starvation with new comments
Updated alert - SearchHeadLevel - Detect MongoDB errors with update to handle tstats issue in Splunk (issue #3 in github)
Moved splunklib from bin to lib directory as per new appinspect recommendations
Please note that if you are upgrading from an older version the bin/splunklib directory can be deleted from this app ($SPLUNK_HOME/etc/apps/SplunkAdmins/bin/splunklib is no longer required)