Find​In​Project

by Wramberg ST3

Text search plugin for Sublime Text 3 projects

Labels find, search, project

Details

Installs

  • Total 887
  • Win 463
  • OS X 240
  • Linux 184
[Chart: daily installs per platform (Windows, OS X, Linux), Sep 8 to Oct 22]

Readme

This is an alternative to the default “Find in files” command that comes with Sublime Text. It includes an interactive result view and a configurable search thread that runs in the background.

Results are ordered using basic implementations of the TF-IDF and Page-Rank algorithms.

Installation

The plugin is tested on Windows and Linux but should also work on macOS. To install it from https://packagecontrol.io/ do the following:

  1. Open the command palette and find “Package Control: Install Package”
  2. Search for FindInProject and install.

To install from GitHub do the following:

  1. Locate the Sublime Text packages folder via the menu:
     Preferences -> Browse Packages...
  2. Clone or download the git repository into a new folder named “FindInProject” under the packages folder.

Configuration

All configuration is available through the menu:

Preferences -> Package Settings -> FindInProject

This includes

  • Default settings which can be copied into the user settings and then changed
  • Default keymap which can be overridden in the user keymap
  • Default color scheme which can be copied into the user color scheme and then changed

The settings include options for

  • Encodings to try
  • Maximum line length in result view
  • Directories and file extensions to ignore
  • File sizes to ignore
  • Excessive hit count (to cancel large searches)
  • Term separation regex pattern (defaults to non-word characters such as whitespace and punctuation)
  • Page reference regex pattern (defaults to double square brackets around word characters, for example [[SamplePage]])
  • and more (descriptive comments are included in the settings file)
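As a rough illustration of the two regex defaults described above, the following Python sketch splits text into terms and extracts page references. The exact patterns are assumptions based on the descriptions (non-word characters, double square brackets around word characters); the actual defaults are in the settings file, and the names `TERM_SEPARATOR`, `PAGE_REFERENCE`, `split_terms` and `find_page_refs` are hypothetical.

```python
import re

# Assumed stand-ins for the plugin defaults; the real patterns live in the settings file.
TERM_SEPARATOR = re.compile(r"\W+")            # runs of non-word characters
PAGE_REFERENCE = re.compile(r"\[\[(\w+)\]\]")  # e.g. [[SamplePage]]


def split_terms(text):
    """Split text into terms, dropping the empty strings that split() can produce."""
    return [term for term in TERM_SEPARATOR.split(text) if term]


def find_page_refs(text):
    """Return the page names referenced in text, without the brackets."""
    return PAGE_REFERENCE.findall(text)


sample = "See [[SamplePage]] for details, or [[Other_Page]]."
print(split_terms(sample))
print(find_page_refs(sample))
```

A stricter or looser pattern can be substituted in the user settings if, for example, page names contain hyphens.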

Usage

In normal contexts (using the default keymap) the following shortcut is available.

Shortcut | Command | Description
ctrl+shift+f | find_in_project | Opens FindInProject input panel

When in a result view (using the default keymap) the following shortcuts are available.

Shortcut / Action | Command | Description
Up / Down | find_in_project_next_line | Browse back / forward in results
Pageup / Pagedown | find_in_project_next_file | Browse back / forward between files
Left / Right | find_in_project_fold | Fold / Unfold results within the selected file
Enter / Double-click | find_in_project_open_result | Open currently selected result

For details, see the keymap file available through the menu:

Preferences -> Package Settings -> FindInProject

TF-IDF Algorithm

The term frequency (TF) inverse document frequency (IDF) algorithm scans all files during the document indexing process and extracts a set of terms from each file. These terms are added to both the overall dictionary and the current document's dictionary, together with a count of the number of times each term is used. The counts are normalised to the total number of terms in the document and inserted into the IDF map.

The search query's terms are used to calculate a score per document using:

doc_score = SUM (search_term.normal + document_term.normal) / len(overall_terms)
                FOR search_term in search_terms
                IF search_term in document_terms
                WHERE document_term = document_terms[search_term]

Each document term's score is checked against the threshold constant to determine if this document should be included in the results. The score is merged with the other search scores using a weighted average.

Related settings:

  • find_in_project_term_threshold - The search term score threshold for inclusion in the results
  • find_in_project_terms_weight - The document terms score weighting used in calculating the weighted average
  • find_in_project_term_separator_pattern - The regular expression used to split the contents of a file into terms
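The scoring formula above can be sketched in Python as follows. This is a minimal sketch rather than the plugin's code: it assumes that `.normal` means a term's count divided by the total number of terms in its text, that terms are lowercased, and that terms are split on non-word characters. The helper names `normalised_counts` and `doc_score` are hypothetical, and the threshold check and weighted average are left out.

```python
import re
from collections import Counter


def normalised_counts(text, separator=r"\W+"):
    """Map each term to its count divided by the total number of terms."""
    terms = [t for t in re.split(separator, text.lower()) if t]
    total = len(terms)
    if total == 0:
        return {}
    return {term: count / total for term, count in Counter(terms).items()}


def doc_score(query, document, overall_terms):
    """Sum (query weight + document weight) over shared terms, scaled by dictionary size."""
    search_terms = normalised_counts(query)
    document_terms = normalised_counts(document)
    total = sum(
        weight + document_terms[term]
        for term, weight in search_terms.items()
        if term in document_terms
    )
    return total / len(overall_terms) if overall_terms else 0.0


docs = {"a.txt": "find text in project files", "b.txt": "page rank of pages"}

# The overall dictionary is the union of every document's terms.
overall = set()
for text in docs.values():
    overall.update(normalised_counts(text))

scores = {name: doc_score("find project", text, overall) for name, text in docs.items()}
print(scores)
```

Here only a.txt shares terms with the query, so b.txt scores zero and would fall below any positive threshold.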

Page Rank Algorithm

The page rank algorithm identifies page references within each file, where each page name is derived from the filename by removing the path and extension. Page references are (by default) just a page name surrounded by double square brackets.

Each file is scanned as part of the document indexing process and is added to the graph of page nodes, with all page references being added as links in the graph to other page nodes.

Once all files have been scanned, the page rank algorithm calculates the page rank scores as follows:

DEF calculate_ranks:
    FOR page IN graph
        ranks[page] = (1 - damping) / len(graph) +
                        damping *
                            SUM ranks[in_page] / in_page.out_count
                                FOR in_page IN page.in_links

This is repeated iteratively, where each iteration calculates the next set of ranks, until the sum of the deltas between successive rank scores is less than a specified threshold (epsilon):

ranks = MAP page: 1 / len(graph) FOR page IN graph

WHILE delta > epsilon
    WHERE delta = SUM abs(ranks[page] - prev_ranks[page])
        FOR page IN graph

    prev_ranks = COPY(ranks)
    calculate_ranks()

Each page rank score is merged with the other search scores using a weighted average.

Related settings:

  • find_in_project_disable_page_rank - If set to true, the page rank algorithm will not be used
  • find_in_project_page_ref_pattern - The regular expression used to identify page references within a file
  • find_in_project_page_rank_damping - The page rank damping used in calculating the page rank in each iteration
  • find_in_project_page_rank_epsilon - The page rank epsilon used in determining when to stop iterating
  • find_in_project_ranks_weight - The page rank score weighting used in calculating the weighted average
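The iteration described above can be sketched in Python as shown below. This is an illustrative version under stated assumptions, not the plugin's implementation: the function name `page_rank`, the `links` input format, the default values for `damping` and `epsilon`, and the `max_iterations` safety cap are all hypothetical.

```python
def page_rank(links, damping=0.85, epsilon=1e-6, max_iterations=100):
    """Compute ranks for a graph given as {page: [referenced pages]}.

    The default damping and epsilon are assumed values, not the plugin's defaults.
    """
    # Every page that links or is linked to becomes a node in the graph.
    pages = set(links) | {p for targets in links.values() for p in targets}
    out_count = {p: len(set(links.get(p, ()))) for p in pages}
    in_links = {p: set() for p in pages}
    for source, targets in links.items():
        for target in targets:
            in_links[target].add(source)

    # Start from a uniform distribution.
    ranks = {p: 1 / len(pages) for p in pages}
    for _ in range(max_iterations):
        prev_ranks = ranks
        ranks = {
            p: (1 - damping) / len(pages)
            + damping * sum(prev_ranks[q] / out_count[q] for q in in_links[p])
            for p in pages
        }
        # Stop once the total change between iterations is small enough.
        delta = sum(abs(ranks[p] - prev_ranks[p]) for p in pages)
        if delta < epsilon:
            break
    return ranks


graph = {"Home": ["About", "Docs"], "About": ["Home"], "Docs": ["Home"]}
ranks = page_rank(graph)
print(sorted(ranks.items(), key=lambda item: -item[1]))
```

In this small graph, Home is referenced by both other pages and therefore ends up with the highest rank. Pages without outgoing references are not given special treatment here, so ranks may sum to less than one in such graphs.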

Mouse events

The current implementation of Sublime Text allows simple mouse event mappings, but these mappings do not support selector contexts. As a result, any mapping may conflict with other mappings that are defined.

The default handling of the double-click mouse event is to select and find the double-clicked word. The command find_in_project_mouse_event has been added to provide support for selector contexts. This allows a custom mapping for the double-click mouse event to support both actions depending on the selector context, for example:

File ${USER}/AppData/Roaming/Sublime Text 3/Packages/User/Default.sublime-mousemap:

[{
    "button": "button1", "count": 2,
    "command": "find_in_project_mouse_event",
    "args": { "commands": [
        {
            "command": "find_in_project_open_result",
            "run_for_selector": "text.findinproject"
        },
        {
            "command": "find_under_expand"
        }
    ]}
}]
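One way to picture how such a command list could be resolved is sketched below in Python. How find_in_project_mouse_event actually selects a command is not described here, so this assumes that entries are tried in order and that an entry without run_for_selector always matches. The function `pick_command` and the two stand-in selector checks are hypothetical; inside Sublime Text, the check could be done with `view.match_selector(point, selector)`.

```python
def pick_command(commands, matches_selector):
    """Return the first command whose selector matches, or that has no selector."""
    for entry in commands:
        selector = entry.get("run_for_selector")
        if selector is None or matches_selector(selector):
            return entry["command"]
    return None


commands = [
    {"command": "find_in_project_open_result", "run_for_selector": "text.findinproject"},
    {"command": "find_under_expand"},
]


def in_result_view(selector):
    """Stand-in scope check for a click inside a FindInProject result view."""
    return selector == "text.findinproject"


def in_other_view(selector):
    """Stand-in scope check for a click in any other view."""
    return False


print(pick_command(commands, in_result_view))
print(pick_command(commands, in_other_view))
```

Under this assumption, the order of the entries matters: the selector-specific command must come before the catch-all entry.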

Releases

  • 0.1.0: Initial version
    • Searches all files in project and displays results in separate buffer.
  • 0.2.0: Enhanced search
    • Basic implementations of the TF-IDF and page-rank algorithms.
  • 0.2.1: Usability tweaks
    • Display query in result buffer.
    • Double-click to open result.