diff --git a/README.md b/README.md
index b7a3221..a4a8719 100644
--- a/README.md
+++ b/README.md
@@ -1,133 +1,133 @@
# Merge dictionaries
## Root problem
You use several IDEs, and each maintains its own spelling dictionary.
You want to merge them so words from PyCharm are available in PhpStorm too.
## Usage
### Merge all dictionaries
To discover dictionaries on your computer, extract words and merge them:
```shell
$ merge-dictionaries --merge
```
This is a potentially destructive operation:
your dictionary files will be overwritten.
### Extract dictionary words
To print all the words:
```shell
$ merge-dictionaries --extract
```
This is a safe operation.
### Build a Hunspell-compatible dictionary
To create a personal dictionary file for your Hunspell dictionary:
```shell
$ merge-dictionaries --extract > $HOME/.hunspell_default
```
This is a safe read-only operation for your IDE files. This can
overwrite your default Hunspell dictionary if it already exists.
### Build a dictionary in an IDE-specific format
You can specify `--format=<format>` as an argument to the extract task:
```shell
$ merge-dictionaries --extract --format=JetBrains
```
It will output a dictionary file you can use in any IDE compatible with that format.
This is a safe read-only operation.
### Sync with a Git repository
Create a `$HOME/.config/merge-dictionaries.conf` with the following content:
```yaml
git:
- git@github.com:luser/dictionary.git
```
See below if you wish to host the Git repository locally.
+### Delete words from a dictionary
+
+Now that your dictionaries are synced, it can be tricky to delete words from them,
+as the next sync will restore any word you removed if it is still present in
+the Git repository or in one local file.
+
+If you added a word to a dictionary, you can delete it from all of them:
+
+```shell
+$ merge-dictionaries --delete-words <word> [word ...]
+```
+
+This is a potentially destructive operation:
+your dictionary files will be overwritten.
+
## IDE support
Currently, the following IDEs are supported:
* All JetBrains IDEs: application-level dictionary
* Hunspell: read personal dictionaries
* Git repository
## Extend the code
### How to add an IDE?
To add an IDE, you need to provide the following methods:
* sources
  * a list of candidate paths for the IDE dictionary
  * a method extracting words from the dictionary
* output
  * a method to dump the extracted words in the IDE format
* write
  * a method to save the files, normally you can call the ones created
+  * a method to rewrite a file with a list of words, so delete works too
### How can I contribute?
You can commit your changes to the upstream by following instructions at https://agora.nasqueron.org/How_to_contribute_code
The canonical repository is https://devcentral.nasqueron.org/source/merge-dictionaries.git
## FAQ
-### Delete a word
-
-Not yet implemented. Here a proposal to implement this.
-
-Curently, the workflow is:
-
-[ extract ] -> { words } -> [ publish ]
-
-You want to add a new transformation step:
-
-[ extract ] -> { words } -> [ transform ] -> { words cleaned up } -> [ publish ]
-
-Add a transform step with an allowlist of the words to remove.
-
-It's not easy to detect if the user has removed a word explicitly
-from a dictionary, as we don't cache extracted words.
-
### Host the Git repository locally
If you want to host the repository on your local machine, use a bare repository:
```shell
$ git init --bare ~/.cache/dictionary
Initialized empty Git repository in /usr/home/luser/.cache/dictionary/
```
You can push to a bare repository, but non-bare ones are protected against pushes,
to avoid a desync between your index and the working files.
Alternatively, you can prepare a script to run this sequence of operations:
```shell
$ merge-dictionaries --merge
$ cd ~/.cache/dictionary
$ git reset
```
## License
BSD-2-Clause, see [LICENSE](LICENSE) file.
diff --git a/setup.cfg b/setup.cfg
index 76bac5e..04b207f 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,33 +1,33 @@
[metadata]
name = merge-dictionaries
-version = 0.3.0
+version = 0.4.0
author = Sébastien Santoro
author_email = dereckson@espace-win.org
description = Merge dictionaries
long_description = file: README.md
long_description_content_type = text/markdown
license = BSD-2-Clause
license_files = LICENSE
url = https://devcentral.nasqueron.org/source/merge-dictionaries/
project_urls =
    Bug Tracker = https://devcentral.nasqueron.org/tag/development_tools/
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: BSD License
    Operating System :: OS Independent
    Environment :: Console
    Intended Audience :: Developers
    Topic :: Software Development :: Build Tools
[options]
package_dir =
    = src
packages = find:
scripts =
    bin/merge-dictionaries
python_requires = >=3.7
install_requires =
    PyYAML>=6.0,<7.0
[options.packages.find]
where = src
diff --git a/src/mergedictionaries/actions/__init__.py b/src/mergedictionaries/actions/__init__.py
new file mode 100644
index 0000000..49b79cd
--- /dev/null
+++ b/src/mergedictionaries/actions/__init__.py
@@ -0,0 +1,9 @@
+# -------------------------------------------------------------
+# Merge dictionaries :: Actions
+# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+# Project: Nasqueron
+# License: BSD-2-Clause
+# -------------------------------------------------------------
+
+
+from .delete_words import DeleteAction
diff --git a/src/mergedictionaries/actions/delete_words.py b/src/mergedictionaries/actions/delete_words.py
new file mode 100644
index 0000000..ace530c
--- /dev/null
+++ b/src/mergedictionaries/actions/delete_words.py
@@ -0,0 +1,46 @@
+#!/usr/bin/env python3
+
+# -------------------------------------------------------------
+# Delete words from all dictionaries
+# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+# Project: Nasqueron
+# Description: Delete words from all dictionaries from
+# all currently found sources.
+# License: BSD-2-Clause
+# -------------------------------------------------------------
+
+
+from mergedictionaries import sources, write
+from mergedictionaries.sources import GitRepository
+
+
+class DeleteAction:
+    def __init__(self, context):
+        self.git_config = context["config"].get("git", [])
+
+        if "git" not in context:
+            context["git"] = []
+        self.git_cached_repos = context["git"]
+
+    def run(self, words):
+        for source in self.get_sources():
+            for arg in source["query"]():
+                source["delete"](words, arg)
+
+    def get_sources(self):
+        return [
+            {
+                "query": sources.jetbrains.find_application_level_dictionaries,
+                "delete": lambda words, file: write.jetbrains.delete_words(file, words),
+            },
+            {
+                "query": self.query_git_repositories,
+                "delete": lambda words, repo: write.git.delete_words(repo, words),
+            },
+        ]
+
+    def query_git_repositories(self):
+        return [
+            GitRepository(git_repo_url, self.git_cached_repos)
+            for git_repo_url in self.git_config
+        ]
diff --git a/src/mergedictionaries/app/app.py b/src/mergedictionaries/app/app.py
index db8bbd5..fd4cd90 100644
--- a/src/mergedictionaries/app/app.py
+++ b/src/mergedictionaries/app/app.py
@@ -1,161 +1,173 @@
#!/usr/bin/env python3
# -------------------------------------------------------------
# Merge dictionaries
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Project: Nasqueron
# Description: Merge dictionaries from various sources,
# mainly IDEs, and allow to propagate them.
# License: BSD-2-Clause
# -------------------------------------------------------------
import argparse
import os
import sys
import yaml
from mergedictionaries import write, output, sources
+from mergedictionaries.actions import DeleteAction
# -------------------------------------------------------------
# Extract words
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
def get_dictionary_formatters():
    return {
        "JetBrains": output.jetbrains.dump,
    }
# -------------------------------------------------------------
# Configuration
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
def get_configuration_path():
    return os.environ["HOME"] + "/.config/merge-dictionaries.conf"
def parse_configuration():
    try:
        with open(get_configuration_path()) as fd:
            return yaml.safe_load(fd) or {}
    except OSError:
        return {}
# -------------------------------------------------------------
# Application entry point
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
def parse_arguments():
    parser = argparse.ArgumentParser(description="Merge dictionaries.")
    parser.add_argument(
        "--extract",
        action="store_const",
        dest="task",
        const="extract",
        help="Extract all words from found dictionaries",
    )
    parser.add_argument(
        "--format", action="store", help="Specifies the output format", default="text"
    )
    parser.add_argument(
        "--merge",
        action="store_const",
        dest="task",
        const="merge",
        help="Merge all found dictionaries",
    )
+    parser.add_argument(
+        "-D",
+        "--delete-words",
+        metavar="word",
+        dest="delete_words",
+        nargs="+",
+        help="Delete one or more words from the dictionaries",
+    )
+
    return parser.parse_args()
class Application:
    def __init__(self):
        self.context = {"git": {}}
    def run(self):
        args = parse_arguments()
-        if args.task is None:
+        task = "delete" if args.delete_words else args.task
+        if task is None:
            print("No task has been specified.", file=sys.stderr)
            sys.exit(1)
        self.context["config"] = parse_configuration()
        self.context["args"] = args
-        if args.task == "extract":
+        if task == "extract":
            self.run_extract_all_words(args.format)
-        elif args.task == "merge":
+        elif task == "merge":
            self.run_merge()
+        elif task == "delete":
+            action = DeleteAction(self.context)
+            action.run(args.delete_words)
+
+        self.on_exit()
+        sys.exit(0)
    def get_dictionary_writers(self):
        return [
            lambda words: write.jetbrains.write(words),
            lambda words: write.git.write(
                words, self.context["config"].get("git", []), self.context["git"]
            ),
        ]
    def run_merge(self):
        words = self.extract_all_words()
        for method in self.get_dictionary_writers():
            method(words)
-        self.on_exit()
-
    def get_words_sources(self):
        return [
            lambda: sources.git.extract_words_from_all_dictionaries(
                self.context["config"].get("git", []), self.context["git"]
            ),
            lambda: sources.jetbrains.extract_words_from_all_dictionaries(),
            lambda: sources.hunspell.extract_words_from_all_dictionaries(),
        ]
    def extract_all_words(self):
        return sorted(
            {word for method in self.get_words_sources() for word in method()}
        )
    def run_extract_all_words(self, words_format):
        words = self.extract_all_words()
        # Trivial case
        if words_format == "text" or words_format == "hunspell":
            if words_format == "hunspell":
                print(len(words))
            for word in words:
                print(word)
-            self.on_exit()
-            sys.exit(0)
+            return
        # We need a specific formatter
        formatters = get_dictionary_formatters()
        if words_format not in formatters:
            print(f"Unknown format: {words_format}", file=sys.stderr)
            self.on_exit()
            sys.exit(2)
        print(formatters[words_format](words))
-        self.on_exit()
-        sys.exit(0)
    def on_exit(self):
        """Events to run before exiting to cleanup resources."""
-        sources.git.on_exit(self.context["git"])
+        sources.git.on_exit(self.context.get("git", {}))
def run():
    app = Application()
    app.run()
diff --git a/src/mergedictionaries/utils/collections.py b/src/mergedictionaries/utils/collections.py
new file mode 100644
index 0000000..4e59383
--- /dev/null
+++ b/src/mergedictionaries/utils/collections.py
@@ -0,0 +1,31 @@
+# -------------------------------------------------------------
+# Merge dictionaries :: Utilities :: Collections
+# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+# Project: Nasqueron
+# Description: Helper functions for lists
+# License: BSD-2-Clause
+# -------------------------------------------------------------
+
+
+from typing import List
+
+
+def remove_words(current_words: List, words_to_delete: List) -> List:
+    """
+    Removes specified words from a list of words.
+
+    Parameters:
+        current_words: list
+            The list of words from which specified words are to be removed.
+        words_to_delete: list
+            The list containing words that need to be removed.
+
+    Returns:
+        list
+            A new list containing words from the current_words list that are not
+            present in the words_to_delete list.
+    """
+    words = list(set(current_words) - set(words_to_delete))
+    words.sort()
+
+    return words
diff --git a/src/mergedictionaries/write/git.py b/src/mergedictionaries/write/git.py
index ff9873e..ec7ba7d 100644
--- a/src/mergedictionaries/write/git.py
+++ b/src/mergedictionaries/write/git.py
@@ -1,33 +1,52 @@
# -------------------------------------------------------------
# Merge dictionaries :: Publishers :: Git repository
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Project: Nasqueron
# Description: Find application-level dictionaries
# from Git repository
# License: BSD-2-Clause
# -------------------------------------------------------------
+
+
import os
from tempfile import NamedTemporaryFile
+from typing import List
from mergedictionaries.sources import GitRepository
+from mergedictionaries.utils.collections import remove_words
def build_temporary_dictionary(words):
    fd = NamedTemporaryFile(delete=False)
    for word in words:
        fd.write(f"{word}\n".encode("utf-8"))
    fd.close()
    return fd.name
def write(words, target_repos, cached_repos):
    if not target_repos:
        return
    tmp_dictionary_path = build_temporary_dictionary(words)
    for repo in target_repos:
        GitRepository(repo, cached_repos).publish(tmp_dictionary_path)
    os.unlink(tmp_dictionary_path)
+
+
+def delete_words(repo: GitRepository, words_to_delete: List):
+    current_words = repo.extract_words()
+
+    if not any(word in current_words for word in words_to_delete):
+        # Nothing to do, the dictionary is already up to date.
+        return
+
+    words = remove_words(current_words, words_to_delete)
+
+    tmp_dictionary_path = build_temporary_dictionary(words)
+    repo.publish(tmp_dictionary_path)
+
+    os.unlink(tmp_dictionary_path)
diff --git a/src/mergedictionaries/write/jetbrains.py b/src/mergedictionaries/write/jetbrains.py
index 5bcdb93..64f5e2e 100644
--- a/src/mergedictionaries/write/jetbrains.py
+++ b/src/mergedictionaries/write/jetbrains.py
@@ -1,21 +1,37 @@
# -------------------------------------------------------------
# Merge dictionaries :: Publishers :: JetBrains IDEs
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Project: Nasqueron
# Description: Find application-level dictionaries
# from JetBrains IDEs
# License: BSD-2-Clause
# -------------------------------------------------------------
from mergedictionaries.sources import jetbrains as jetbrains_source
from mergedictionaries.output import jetbrains as jetbrains_output
+from mergedictionaries.utils.collections import remove_words
def write(words):
    contents = jetbrains_output.dump(words)
    for file_path in jetbrains_source.find_application_level_dictionaries():
        with open(file_path, "w") as fd:
            fd.write(contents)
            fd.write("\n")
+
+
+def delete_words(file_path, words_to_delete):
+    current_words = jetbrains_source.extract_words(file_path)
+
+    if not any(word in current_words for word in words_to_delete):
+        # Nothing to do, the dictionary is already up to date.
+        return
+
+    words = remove_words(current_words, words_to_delete)
+
+    contents = jetbrains_output.dump(words)
+    with open(file_path, "w") as fd:
+        fd.write(contents)
+        fd.write("\n")
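
Note on the two `delete_words` helpers in this diff: both check whether any requested word is present, return early if not, and otherwise call `remove_words`. Because `remove_words` goes through `set`, the rewritten dictionary comes back deduplicated and sorted rather than in its original order. The sketch below copies `remove_words` from this diff and pairs it with `delete_from`, a hypothetical in-memory helper that is not part of the patch, to illustrate that behavior without touching any file or repository.

```python
from typing import List, Optional


def remove_words(current_words: List[str], words_to_delete: List[str]) -> List[str]:
    """Same logic as utils/collections.py in this diff."""
    words = list(set(current_words) - set(words_to_delete))
    words.sort()
    return words


def delete_from(
    current_words: List[str], words_to_delete: List[str]
) -> Optional[List[str]]:
    """Hypothetical helper (assumption, not in the patch).

    Mirrors the early-return check of the delete_words functions:
    returns None when no requested word is present, so the caller
    knows there is nothing to rewrite.
    """
    if not any(word in current_words for word in words_to_delete):
        return None
    return remove_words(current_words, words_to_delete)


if __name__ == "__main__":
    dictionary = ["zsh", "nasqueron", "zsh", "devcentral"]
    print(delete_from(dictionary, ["nasqueron"]))  # ['devcentral', 'zsh']
    print(delete_from(dictionary, ["absent"]))     # None
```

The first call shows that the duplicate `zsh` entry is collapsed and the result is sorted; the second shows the no-op path that leaves the dictionary untouched.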
