F12297888
diff --git a/README.md b/README.md
index b7a3221..a4a8719 100644
--- a/README.md
+++ b/README.md
@@ -1,133 +1,133 @@
# Merge dictionaries
## Root problem
You use several IDEs and each maintains its own spelling dictionary.
You want to merge them so words from PyCharm are available in PhpStorm too.
## Usage
### Merge all dictionaries
To discover dictionaries in your computer, extract words and merge them:
```shell
$ merge-dictionaries --merge
```
This is a potentially destructive operation:
your dictionary files will be overwritten.
### Extract dictionaries words
To print all the words:
```shell
$ merge-dictionaries --extract
```
This is a safe operation.
### Build an Hunspell-compatible dictionary
To create a personal dictionary file for your Hunspell dictionary:
```shell
$ merge-dictionaries --extract > $HOME/.hunspell_default
```
This is a safe read-only operation for your IDE files. This can
overwrite your default Hunspell dictionary if it already exists.
### Build a dictionary in a IDE specific format
You can specify `--format=<format>` as argument to the extract task:
```shell
$ merge-dictionaries --extract --format=JetBrains
```
It will output a dictionary file you can use in any IDE compatible with that format.
This is a safe read-only operation.
### Sync with a Git repository
Create a `$HOME/.config/merge-dictionaries.conf` with the following content:
```yaml
git:
- git@github.com:luser/dictionary.git
```
See below if you wish to host the Git repository locally.
+### Delete words from a dictionary
+
+Now that your dictionaries are synced, it can be tricky to delete words from them,
+as the next sync will overwrite them and restore words you removed if they are
+still in the Git repository or in a local file.
+
+If you added a word in a dictionary, you can delete it:
+
+```shell
+$ merge-dictionaries --delete-words <word> [word ...]
+```
+
+This is a potentially destructive operation:
+your dictionary files will be overwritten.
+
## IDE support
Currently, the following IDEs are supported
* All JetBrains IDEs: application-level dictionary
* Hunspell: read personal dictionaries
* Git repository
## Extend the code
### How to add an IDE?
To add an IDE, you need to provide the following methods:
 * sources
     * a list of paths candidates for the IDE dictionary
     * a method extracting words from the dictionary
 * output
     * a method to dump the extracted words in the IDE format
 * write
     * a method to save the files, normally you can call the ones created
+    * a method to rewrite a file with a list of words, so delete works too
### How can I contribute?
You can commit your changes to the upstream by following instructions at https://agora.nasqueron.org/How_to_contribute_code
The canonical repository is https://devcentral.nasqueron.org/source/merge-dictionaries.git
## FAQ
-### Delete a word
-
-Not yet implemented. Here a proposal to implement this.
-
-Curently, the workflow is:
-
-[ extract ] -> { words } -> [ publish ]
-
-You want to add a new transformation step:
-
-[ extract ] -> { words } -> [ transform ] -> { words cleaned up } -> [ publish ]
-
-Add a transform step with an allowlist of the words to remove.
-
-It's not easy to detect if the user has removed a word explicitly
-from a dictionary, as we don't cache extracted words.
-
### Host locally the Git repository
If you want to host the repository on your local machine, use a bare repository:
```shell
$ git init --bare ~/.cache/dictionary
Initialized empty Git repository in /usr/home/luser/.cache/dictionary/
```
You can push to a bare repository, but non-bare ones are protected against pushes,
to avoid a desync between your index and the working files.
Alternatively, you can prepare a script to do this sequence of operation:
```shell
$ merge-dictionaries --merge
$ cd ~/.cache/dictionary
$ git reset
```
## License
BSD-2-Clause, see [LICENSE](LICENSE) file.
diff --git a/setup.cfg b/setup.cfg
index 76bac5e..04b207f 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,33 +1,33 @@
 [metadata]
 name = merge-dictionaries
-version = 0.3.0
+version = 0.4.0
 author = Sébastien Santoro
 author_email = dereckson@espace-win.org
 description = Merge dictionaries
 long_description = file: README.md
 long_description_content_type = text/markdown
 license = BSD-2-Clause
 license_files = LICENSE
 url = https://devcentral.nasqueron.org/source/merge-dictionaries/
 project_urls =
     Bug Tracker = https://devcentral.nasqueron.org/tag/development_tools/
 classifiers =
     Programming Language :: Python :: 3
     License :: OSI Approved :: BSD License
     Operating System :: OS Independent
     Environment :: Console
     Intended Audience :: Developers
     Topic :: Software Development :: Build Tools
 [options]
 package_dir =
     = src
 packages = find:
 scripts =
     bin/merge-dictionaries
 python_requires = >=3.7
 install_requires =
     PyYAML>=6.0,<7.0
 [options.packages.find]
 where = src
diff --git a/src/mergedictionaries/actions/__init__.py b/src/mergedictionaries/actions/__init__.py
new file mode 100644
index 0000000..49b79cd
--- /dev/null
+++ b/src/mergedictionaries/actions/__init__.py
@@ -0,0 +1,9 @@
+# -------------------------------------------------------------
+# Merge dictionaries :: Actions
+# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+# Project: Nasqueron
+# License: BSD-2-Clause
+# -------------------------------------------------------------
+
+
+from .delete_words import DeleteAction
diff --git a/src/mergedictionaries/actions/delete_words.py b/src/mergedictionaries/actions/delete_words.py
new file mode 100644
index 0000000..ace530c
--- /dev/null
+++ b/src/mergedictionaries/actions/delete_words.py
@@ -0,0 +1,46 @@
+#!/usr/bin/env python3
+
+# -------------------------------------------------------------
+# Delete words from all dictionaries
+# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+# Project: Nasqueron
+# Description: Delete words from all dictionaries from
+# all currently found sources.
+# License: BSD-2-Clause
+# -------------------------------------------------------------
+
+
+from mergedictionaries import sources, write
+from mergedictionaries.sources import GitRepository
+
+
+class DeleteAction:
+    def __init__(self, context):
+        self.git_config = context["config"].get("git", [])
+
+        if "git" not in context:
+            context["git"] = []
+        self.git_cached_repos = context["git"]
+
+    def run(self, words):
+        for source in self.get_sources():
+            for arg in source["query"]():
+                source["delete"](words, arg)
+
+    def get_sources(self):
+        return [
+            {
+                "query": sources.jetbrains.find_application_level_dictionaries,
+                "delete": lambda words, file: write.jetbrains.delete_words(file, words),
+            },
+            {
+                "query": self.query_git_repositories,
+                "delete": lambda words, repo: write.git.delete_words(repo, words),
+            },
+        ]
+
+    def query_git_repositories(self):
+        return [
+            GitRepository(git_repo_url, self.git_cached_repos)
+            for git_repo_url in self.git_config
+        ]
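
Review note: `DeleteAction.run` pairs each "query" callable (which lists targets) with a "delete" callable (which removes words from one target). The sketch below reproduces that dispatch shape with hypothetical in-memory stores instead of JetBrains files and Git repositories; `files`, `repos` and `delete_from` are illustrative names, not part of the project.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins for dictionary files and Git repositories.
files: Dict[str, List[str]] = {
    "ide-a.dic": ["alpha", "beta"],
    "ide-b.dic": ["beta", "gamma"],
}
repos: Dict[str, List[str]] = {"repo-1": ["alpha", "beta", "gamma"]}


def delete_from(store: Dict[str, List[str]], key: str, words: List[str]) -> None:
    """Remove words from one target and keep the result sorted."""
    store[key] = sorted(set(store[key]) - set(words))


def get_sources() -> List[Dict[str, Callable]]:
    # Each entry pairs a "query" (list the targets) with a "delete"
    # (remove words from a single target), as in the diff above.
    return [
        {
            "query": lambda: list(files),
            "delete": lambda words, name: delete_from(files, name, words),
        },
        {
            "query": lambda: list(repos),
            "delete": lambda words, name: delete_from(repos, name, words),
        },
    ]


def run(words: List[str]) -> None:
    for source in get_sources():
        for arg in source["query"]():
            source["delete"](words, arg)


run(["beta"])
print(files)
print(repos)
```

With this shape, supporting another dictionary type should only require one more query/delete pair in `get_sources`.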
diff --git a/src/mergedictionaries/app/app.py b/src/mergedictionaries/app/app.py
index db8bbd5..fd4cd90 100644
--- a/src/mergedictionaries/app/app.py
+++ b/src/mergedictionaries/app/app.py
@@ -1,161 +1,173 @@
#!/usr/bin/env python3
# -------------------------------------------------------------
# Merge dictionaries
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Project: Nasqueron
# Description: Merge dictionaries from various sources,
# mainly IDEs, and allow to propagate them.
# License: BSD-2-Clause
# -------------------------------------------------------------
 import argparse
 import os
 import sys
 import yaml
 from mergedictionaries import write, output, sources
+from mergedictionaries.actions import DeleteAction
 # -------------------------------------------------------------
 # Extract words
 # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 def get_dictionary_formatters():
     return {
         "JetBrains": output.jetbrains.dump,
     }
 # -------------------------------------------------------------
 # Configuration
 # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 def get_configuration_path():
     return os.environ["HOME"] + "/.config/merge-dictionaries.conf"
 def parse_configuration():
     try:
         with open(get_configuration_path()) as fd:
             return yaml.safe_load(fd) or {}
     except OSError:
         return {}
 # -------------------------------------------------------------
 # Application entry point
 # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 def parse_arguments():
     parser = argparse.ArgumentParser(description="Merge dictionaries.")
     parser.add_argument(
         "--extract",
         action="store_const",
         dest="task",
         const="extract",
         help="Extract all words from found dictionaries",
     )
     parser.add_argument(
         "--format", action="store", help="Specifies the output format", default="text"
     )
     parser.add_argument(
         "--merge",
         action="store_const",
         dest="task",
         const="merge",
         help="Merge all found dictionaries",
     )
+    parser.add_argument(
+        "-D",
+        "--delete-words",
+        metavar="word",
+        dest="delete_words",
+        nargs="+",
+        help="Delete one or more words from the dictionaries",
+    )
+
     return parser.parse_args()
 class Application:
     def __init__(self):
         self.context = {"git": {}}
     def run(self):
         args = parse_arguments()
-        if args.task is None:
+        task = "delete" if args.delete_words else args.task
+        if task is None:
             print("No task has been specified.", file=sys.stderr)
             sys.exit(1)
         self.context["config"] = parse_configuration()
         self.context["args"] = args
-        if args.task == "extract":
+        if task == "extract":
             self.run_extract_all_words(args.format)
-        elif args.task == "merge":
+        elif task == "merge":
             self.run_merge()
+        elif task == "delete":
+            action = DeleteAction(self.context)
+            action.run(args.delete_words)
+
+        self.on_exit()
+        sys.exit(0)
     def get_dictionary_writers(self):
         return [
             lambda words: write.jetbrains.write(words),
             lambda words: write.git.write(
                 words, self.context["config"].get("git", []), self.context["git"]
             ),
         ]
     def run_merge(self):
         words = self.extract_all_words()
         for method in self.get_dictionary_writers():
             method(words)
-        self.on_exit()
-
     def get_words_sources(self):
         return [
             lambda: sources.git.extract_words_from_all_dictionaries(
                 self.context["config"].get("git", []), self.context["git"]
             ),
             lambda: sources.jetbrains.extract_words_from_all_dictionaries(),
             lambda: sources.hunspell.extract_words_from_all_dictionaries(),
         ]
     def extract_all_words(self):
         return sorted(
             {word for method in self.get_words_sources() for word in method()}
         )
     def run_extract_all_words(self, words_format):
         words = self.extract_all_words()
         # Trivial case
         if words_format == "text" or words_format == "hunspell":
             if words_format == "hunspell":
                 print(len(words))
             for word in words:
                 print(word)
-            self.on_exit()
-            sys.exit(0)
+            return
         # We need a specific formatter
         formatters = get_dictionary_formatters()
         if words_format not in formatters:
             print(f"Unknown format: {words_format}", file=sys.stderr)
             self.on_exit()
             sys.exit(2)
         print(formatters[words_format](words))
-        self.on_exit()
-        sys.exit(0)
     def on_exit(self):
         """Events to run before exiting to cleanup resources."""
-        sources.git.on_exit(self.context["git"])
+        sources.git.on_exit(self.context.get("git", {}))
 def run():
     app = Application()
     app.run()
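
Review note: with this change, `--delete-words` takes priority over `--merge` and `--extract` when several flags are passed together, because the task is resolved as `"delete" if args.delete_words else args.task`. A minimal standalone sketch of that behavior follows; `build_parser` and `resolve_task` are hypothetical helper names used only for this illustration, and help strings are omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Reduced copy of the options declared in parse_arguments().
    parser = argparse.ArgumentParser(description="Merge dictionaries.")
    parser.add_argument("--extract", action="store_const", dest="task", const="extract")
    parser.add_argument("--merge", action="store_const", dest="task", const="merge")
    parser.add_argument(
        "-D", "--delete-words", metavar="word", dest="delete_words", nargs="+"
    )
    return parser


def resolve_task(args: argparse.Namespace):
    # Same expression as in Application.run(): delete wins when words are given.
    return "delete" if args.delete_words else args.task


parser = build_parser()
args = parser.parse_args(["--merge", "-D", "foo", "bar"])
print(resolve_task(args), args.delete_words)
```

If silently ignoring `--merge` in that case is not intended, an argparse mutually exclusive group would be one way to reject the combination.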
diff --git a/src/mergedictionaries/utils/collections.py b/src/mergedictionaries/utils/collections.py
new file mode 100644
index 0000000..4e59383
--- /dev/null
+++ b/src/mergedictionaries/utils/collections.py
@@ -0,0 +1,31 @@
+# -------------------------------------------------------------
+# Merge dictionaries :: Utilities :: Collections
+# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+# Project: Nasqueron
+# Description: Helper functions for lists
+# License: BSD-2-Clause
+# -------------------------------------------------------------
+
+
+from typing import List
+
+
+def remove_words(current_words: List, words_to_delete: List) -> List:
+    """
+    Removes specified words from a list of words.
+
+    Parameters:
+        current_words: list
+            The list of words from which specified words are to be removed.
+        words_to_delete: list
+            The list containing words that need to be removed.
+
+    Returns:
+        list
+            A new list containing words from the current_words list that are not
+            present in the words_to_delete list.
+    """
+    words = list(set(current_words) - set(words_to_delete))
+    words.sort()
+
+    return words
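
Review note: because `remove_words` goes through a set difference and then sorts, the result is also deduplicated and reordered, which the docstring does not mention. The standalone sketch below mirrors the helper to show this; the sample words are made up for the illustration.

```python
from typing import List


def remove_words(current_words: List[str], words_to_delete: List[str]) -> List[str]:
    """Mirror of the helper added in this diff: set difference, then sort."""
    return sorted(set(current_words) - set(words_to_delete))


# "devcentral" appears twice and "unknown" is absent from the current list.
current = ["nasqueron", "devcentral", "phabricator", "devcentral"]
cleaned = remove_words(current, ["phabricator", "unknown"])
print(cleaned)  # ['devcentral', 'nasqueron']
```

Words to delete that are not in the list are ignored without error.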
diff --git a/src/mergedictionaries/write/git.py b/src/mergedictionaries/write/git.py
index ff9873e..ec7ba7d 100644
--- a/src/mergedictionaries/write/git.py
+++ b/src/mergedictionaries/write/git.py
@@ -1,33 +1,52 @@
# -------------------------------------------------------------
# Merge dictionaries :: Publishers :: Git repository
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Project: Nasqueron
# Description: Find application-level dictionaries
# from Git repository
# License: BSD-2-Clause
# -------------------------------------------------------------
+
+
 import os
 from tempfile import NamedTemporaryFile
+from typing import List
 from mergedictionaries.sources import GitRepository
+from mergedictionaries.utils.collections import remove_words
 def build_temporary_dictionary(words):
     fd = NamedTemporaryFile(delete=False)
     for word in words:
         fd.write(f"{word}\n".encode("utf-8"))
     fd.close()
     return fd.name
 def write(words, target_repos, cached_repos):
     if not target_repos:
         return
     tmp_dictionary_path = build_temporary_dictionary(words)
     for repo in target_repos:
         GitRepository(repo, cached_repos).publish(tmp_dictionary_path)
     os.unlink(tmp_dictionary_path)
+
+
+def delete_words(repo: GitRepository, words_to_delete: List):
+    current_words = repo.extract_words()
+
+    if not any(word in current_words for word in words_to_delete):
+        # Nothing to do, the dictionary is already up to date.
+        return
+
+    words = remove_words(current_words, words_to_delete)
+
+    tmp_dictionary_path = build_temporary_dictionary(words)
+    repo.publish(tmp_dictionary_path)
+
+    os.unlink(tmp_dictionary_path)
diff --git a/src/mergedictionaries/write/jetbrains.py b/src/mergedictionaries/write/jetbrains.py
index 5bcdb93..64f5e2e 100644
--- a/src/mergedictionaries/write/jetbrains.py
+++ b/src/mergedictionaries/write/jetbrains.py
@@ -1,21 +1,37 @@
# -------------------------------------------------------------
# Merge dictionaries :: Publishers :: JetBrains IDEs
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Project: Nasqueron
# Description: Find application-level dictionaries
# from JetBrains IDEs
# License: BSD-2-Clause
# -------------------------------------------------------------
 from mergedictionaries.sources import jetbrains as jetbrains_source
 from mergedictionaries.output import jetbrains as jetbrains_output
+from mergedictionaries.utils.collections import remove_words
 def write(words):
     contents = jetbrains_output.dump(words)
     for file_path in jetbrains_source.find_application_level_dictionaries():
         with open(file_path, "w") as fd:
             fd.write(contents)
             fd.write("\n")
+
+
+def delete_words(file_path, words_to_delete):
+    current_words = jetbrains_source.extract_words(file_path)
+
+    if not any(word in current_words for word in words_to_delete):
+        # Nothing to do, the dictionary is already up to date.
+        return
+
+    words = remove_words(current_words, words_to_delete)
+
+    contents = jetbrains_output.dump(words)
+    with open(file_path, "w") as fd:
+        fd.write(contents)
+        fd.write("\n")
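
Review note: both `delete_words` implementations skip the rewrite when none of the words is present, so an untouched dictionary is never reopened for writing. The sketch below shows the same guard on a plain-text file with one word per line. That format is an assumption made for the illustration (the real JetBrains format is handled by `jetbrains_source` and `jetbrains_output`), and `delete_words_from_text_file` is a hypothetical name.

```python
import os
import tempfile
from typing import List


def delete_words_from_text_file(file_path: str, words_to_delete: List[str]) -> bool:
    """Rewrite the file without the given words; return True if it was rewritten."""
    with open(file_path, encoding="utf-8") as fd:
        current_words = [line.strip() for line in fd if line.strip()]

    if not any(word in current_words for word in words_to_delete):
        # Nothing to do, the dictionary is already up to date.
        return False

    words = sorted(set(current_words) - set(words_to_delete))
    with open(file_path, "w", encoding="utf-8") as fd:
        fd.write("\n".join(words))
        fd.write("\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "words.txt")
    with open(path, "w", encoding="utf-8") as fd:
        fd.write("beta\nalpha\ntypo\n")

    first = delete_words_from_text_file(path, ["typo"])   # word present: rewrite
    second = delete_words_from_text_file(path, ["typo"])  # already gone: skipped
    with open(path, encoding="utf-8") as fd:
        remaining = fd.read().split()

print(first, second, remaining)
```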