DataSet File Format
Store a dataset in an XLSX-like format.
This library contains code for handling the DataSet File Format (DSFF), which is based on the XLSX format, and for converting it to ARFF (for use with the Weka framework), CSV or a FilelessDataset structure (from the Packing Box).
pip install --user dsff
Usage
Creating a DSFF from a FilelessDataset
>>> import dsff
>>> with dsff.DSFF() as f:
    f.write("/path/to/my-dataset")  # folder of a FilelessDataset (containing data.csv, features.json and metadata.json)
    f.to_arff()                     # creates ./my-dataset.arff
    f.to_csv()                      # creates ./my-dataset.csv
# on leaving the context, ./my-dataset.dsff is created
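To give an idea of what an ARFF file for Weka looks like, here is a minimal standard-library sketch that turns CSV text into ARFF text. It is not the code used by `f.to_arff()`. The helper name `csv_to_arff` and the columns `entropy`, `size` and `label` are hypothetical and only serve the illustration, and the sketch assumes that every column is numeric except the last one, which holds the class label.

```python
import csv
import io


def csv_to_arff(csv_text, relation):
    """Convert CSV text to ARFF text (illustrative helper, not part of dsff).

    Assumes a header row, numeric feature columns and a final label column.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # The class attribute is nominal: list its distinct values in sorted order
    labels = sorted({row[-1] for row in data})
    nominal = "{" + ",".join(labels) + "}"
    lines = ["@RELATION " + relation, ""]
    # One numeric attribute declaration per feature column
    for name in header[:-1]:
        lines.append("@ATTRIBUTE " + name + " NUMERIC")
    lines.append("@ATTRIBUTE " + header[-1] + " " + nominal)
    # Data rows follow the @DATA marker, one comma-separated instance per line
    lines += ["", "@DATA"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical sample data, not taken from a real FilelessDataset
sample = "entropy,size,label\n7.9,1024,packed\n4.2,2048,unpacked\n"
arff_text = csv_to_arff(sample, "my-dataset")
print(arff_text)
```

The real converter may handle more attribute types (strings, dates, missing values), so treat this only as a picture of the target format.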
Creating a FilelessDataset from a DSFF
>>> import dsff
>>> with dsff.DSFF("/path/to/my-dataset.dsff") as f:
    f.to_dataset()  # creates ./[dsff-title] with data.csv, features.json and metadata.json
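Since DSFF is based on XLSX, and XLSX files are ZIP archives, a `.dsff` file can probably be inspected with generic ZIP tooling. The sketch below assumes that a `.dsff` file is a regular ZIP-based container, which is not confirmed here. The helper name `list_container_parts` is hypothetical and not part of `dsff`.

```python
import zipfile


def list_container_parts(path):
    """Return the member names of a ZIP-based container, or None if it is not a ZIP file.

    Illustrative helper (not part of dsff); assumes the file follows the
    ZIP packaging used by XLSX.
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling `list_container_parts("/path/to/my-dataset.dsff")` would then list the parts stored in the file, or return `None` if the assumption does not hold for that file.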
Related Projects
You may also like these:
- Awesome Executable Packing: A curated list of awesome resources related to executable packing.
- Bintropy: Analysis tool for estimating the likelihood that a binary contains compressed or encrypted bytes (inspired by this paper).
- Dataset of packed ELF files: Dataset of ELF samples packed with many different packers.
- Dataset of packed PE files: Dataset of PE samples packed with many different packers (fork of this repository).
- Docker Packing Box: Docker image gathering packers and tools for making datasets of packed executables.
- PEiD: Python implementation of the well-known Packed Executable iDentifier (PEiD).
- PyPackerDetect: Packing detection tool for PE files (fork of this repository).
- REMINDer: Packing detector using a simple heuristic (inspired by this paper).