Zotero CLI

Sort and rank your Zotero references easily from your CLI.
Ask questions about your Zotero documents locally with GPT.

This Tinyscript tool relies on pyzotero to communicate with Zotero’s Web API. It lets you list field values, show items in tables in the CLI, or export sorted items to an Excel file.

$ pip install zotero-cli-tool

Quick Start

The first time you start it, the tool asks for your API identifier and key. It caches them in ~/.zotero/creds.txt with read/write permissions for your user only. Data is cached in ~/.zotero/cache/. If you are using a shared group library, you can either pass the “-g” (“--group”) option in your zotero-cli command or, to set it permanently, create an empty file ~/.zotero/group (for example with touch).
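
The credential caching described above can be illustrated with a minimal Python sketch. It is not the tool's actual code: the function names, the one-value-per-line file layout and the `base_dir` parameter are assumptions made for illustration. Only the file names (creds.txt, group) and the owner-only permissions come from the description above.

```python
import os
import stat
import tempfile
from pathlib import Path


def save_credentials(base_dir: Path, api_id: str, api_key: str) -> Path:
    """Write credentials to <base_dir>/creds.txt, accessible by the owner only."""
    base_dir.mkdir(parents=True, exist_ok=True)
    path = base_dir / "creds.txt"
    # Creating the file with mode 0o600 avoids a window where it is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # Assumed layout: one value per line (the real file format may differ).
        handle.write(f"{api_id}\n{api_key}\n")
    # The mode given to os.open is ignored when the file already existed.
    os.chmod(path, 0o600)
    return path


def uses_group_library(base_dir: Path, group_flag: bool = False) -> bool:
    """Group mode is on if the flag was passed or the marker file exists."""
    return group_flag or (base_dir / "group").exists()


if __name__ == "__main__":
    # Use a temporary directory so the sketch never touches a real ~/.zotero.
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / ".zotero"
        creds = save_credentials(base, "1234567", "example-key")
        print(oct(stat.S_IMODE(creds.stat().st_mode)))
        print(uses_group_library(base))
```

Passing the mode at creation time and then calling `os.chmod` covers both a fresh file and one that already existed with looser permissions.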

  • Manually update cached data
$ zotero-cli reset

Note that this can take a while, which is why the data is cached for later use.

  • Count items in a collection
$ zotero-cli count --filter "collections:biblio"
123
  • List values for a given field
$ zotero-cli list itemType

    Type             
    ----             
    computer program 
    conference paper 
    document         
    journal article  
    manuscript       
    thesis           
    webpage          

  • Show entries with the given set of fields, filtered based on multiple criteria and limited to a given number of items
$ zotero-cli show year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"

    Year  Title                                                                                                                             Type              #Pages 
    ----  -----                                                                                                                             ----              ------ 
    2016  Classifying Packed Programs as Malicious Software Detected                                                                        conference paper  3      
    2016  Detecting Packed Executable File: Supervised or Anomaly Detection Method?                                                         conference paper  5      
    2016  Entropy analysis to classify unknown packing algorithms for malware detection                                                     conference paper  21     
    2017  Packer Detection for Multi-Layer Executables Using Entropy Analysis                                                               journal article   18     
    2018  Sensitive system calls based packed malware variants detection using principal component initialized MultiLayers neural networks  journal article   13     
    2018  Effective, efficient, and robust packing detection and classification                                                             journal article   15     
    2019  Efficient automatic original entry point detection                                                                                journal article   14     
    2019  All-in-One Framework for Detection, Unpacking, and Verification for Malware Analysis                                              journal article   16     
    2020  Experimental Comparison of Machine Learning Models in Malware Packing Detection                                                   conference paper  3      
    2020  Building a smart and automated tool for packed malware detections using machine learning                                          thesis            99     

  • Export entries
$ zotero-cli export year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"
$ file export.xlsx 
export.xlsx: Microsoft Excel 2007+

  • Use a predefined query
$ zotero-cli show - --query "top-50-most-relevants"

Note: “-” is used for the field positional argument to tell the tool to select the predefined list of fields included in the query.

This is equivalent to:

$ zotero-cli show year title numPages itemType --limit ">rank:50"

Available queries:

  • no-attachment: list of all items with no attachment; displayed fields: title
  • no-url: list of all items with no URL; displayed fields: year, title
  • top-10-most-relevants: the 10 best-ranked items; displayed fields: year, title, numPages, itemType
  • top-50-most-relevants: same as top-10-most-relevants, but with the 50 best-ranked items
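
To make the “-” placeholder concrete, here is a hedged Python sketch of how a predefined query could expand into explicit arguments. The `PREDEFINED_QUERIES` table and the `expand_query` helper are hypothetical and do not reflect the tool's internals. The top-50 entry mirrors the equivalence shown above; the `>rank:10` limit for top-10 is an assumption by analogy. The no-attachment and no-url queries are left out because their filters are not documented here.

```python
from typing import Dict, List

# Hypothetical table; only the top-50 entry is confirmed by the documentation.
PREDEFINED_QUERIES: Dict[str, Dict[str, object]] = {
    "top-10-most-relevants": {
        "fields": ["year", "title", "numPages", "itemType"],
        "limit": ">rank:10",  # assumed by analogy with the top-50 query
    },
    "top-50-most-relevants": {
        "fields": ["year", "title", "numPages", "itemType"],
        "limit": ">rank:50",
    },
}


def expand_query(command: str, fields: List[str], query: str) -> List[str]:
    """Return the explicit argument list equivalent to `command - --query <query>`."""
    spec = PREDEFINED_QUERIES[query]
    # "-" means: take the field list from the predefined query.
    chosen = list(spec["fields"]) if fields == ["-"] else list(fields)
    return [command, *chosen, "--limit", str(spec["limit"])]


if __name__ == "__main__":
    print(" ".join(expand_query("show", ["-"], "top-50-most-relevants")))
```

In this sketch, passing explicit fields instead of “-” keeps the query's limit but overrides its field list; whether the real tool behaves the same way is not stated in this document.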

Mark items:

$ zotero-cli mark read --filter "title:a nice paper"
$ zotero-cli mark unread --filter "title:a nice paper"

Markers:

  • read / unread: by default, items are displayed in bold; marking an item as read makes it display in normal weight
  • irrelevant / relevant: marking an item as irrelevant excludes it from the output list of items
  • ignore / unignore: marking an item as ignored excludes it completely, including from the ranking algorithm

Local GPT

This feature is based on PrivateGPT. It can be used to ingest local Zotero documents and ask questions about them using a chosen GPT model.

  • Install optional dependencies
$ pip install zotero-cli-tool[gpt]
  • Install one of the following models:

    • ggml-gpt4all-j-v1.3-groovy.bin (default)
    • ggml-gpt4all-l13b-snoozy.bin
    • ggml-mpt-7b-chat.bin
    • ggml-v3-13b-hermes-q5_1.bin
    • ggml-vicuna-7b-1.1-q4_2.bin
    • ggml-vicuna-13b-1.1-q4_2.bin
    • ggml-wizardLM-7B.q4_2.bin
    • ggml-stable-vicuna-13B.q4_2.bin
    • ggml-mpt-7b-base.bin
    • ggml-nous-gpt4-vicuna-13b.bin
    • ggml-mpt-7b-instruct.bin
    • ggml-wizard-13b-uncensored.bin
$ zotero-cli install

The most recently installed model is selected for the ask command (see below).

  • Ingest your documents
$ zotero-cli ingest
  • Ask questions to your documents
$ zotero-cli ask
Using embedded DuckDB with persistence: data will be stored in: /home/morfal/.zotero/db
Found model file.
[...]
Enter a query: 

Special Features

Some additional fields can be used for listing/filtering/showing/exporting data.

  • Computed fields

    • authors: the list of creators with creatorType equal to author
    • citations: the number of relations the item has to other items with a later date
    • editors: the list of creators with creatorType equal to editor
    • numAttachments: the number of child items with itemType equal to attachment
    • numAuthors: the number of creators with creatorType equal to author
    • numCreators: the number of creators
    • numEditors: the number of creators with creatorType equal to editor
    • numNotes: the number of child items with itemType equal to note
    • numPages: the (corrected) number of pages, taken from either the original or the pages field
    • references: the number of relations the item has to other items with an earlier date
    • year: the year coming from the datetime parsing of the date field
  • Extracted fields (from the extra field)

    • comments: custom field for adding comments
    • results: custom field for mentioning results related to the item
    • what: custom field for a short description of what the item is about
    • zscc: number of Scholar citations, computed with the Zotero Google Scholar Citations plugin
  • PageRank-based reference ranking algorithm

    • rank: computed field that ranks references in order of relevance; it uses an algorithm similar to Google’s PageRank while weighting references according to their year of publication (giving more importance to recent references, which cannot have as many citations as older references anyway)
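
As an illustration of the idea behind rank, the following Python sketch runs a PageRank-style power iteration in which the teleport distribution favors recent publication years. It is an assumption-laden sketch, not the tool's actual algorithm: the exact weighting used by zotero-cli is not documented here, and the function name, the linear recency weight and the default parameters are all hypothetical.

```python
from typing import Dict, List


def recency_weighted_pagerank(
    cites: Dict[str, List[str]],
    years: Dict[str, int],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Score items by citations received, with a teleport bias toward recent years.

    cites maps an item to the items it cites; years maps every item to its year.
    """
    if not years:
        return {}
    nodes = sorted(years)
    oldest = min(years.values())
    # Assumed recency weight: grows linearly with the publication year.
    weights = {n: 1.0 + (years[n] - oldest) for n in nodes}
    total = sum(weights.values())
    teleport = {n: weights[n] / total for n in nodes}

    score = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            # Ignore citations to items outside the library.
            targets = [t for t in cites.get(n, []) if t in years]
            if targets:
                share = damping * score[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Items citing nothing spread their score along the teleport vector.
                for t in nodes:
                    nxt[t] += damping * score[n] * teleport[t]
        score = nxt
    return score


if __name__ == "__main__":
    library_years = {"a": 2010, "b": 2018, "c": 2020}
    library_cites = {"c": ["a", "b"], "b": ["a"]}
    scores = recency_weighted_pagerank(library_cites, library_years)
    for item in sorted(scores, key=scores.get, reverse=True):
        print(item, round(scores[item], 3))
```

Biasing the teleport vector is one simple way to compensate for recent items having had less time to accumulate citations; other schemes, such as weighting individual citation edges by year, would serve the same goal.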