Release notes
Semantic Versioning
rows uses [semantic versioning][semver]. This means we do not guarantee API
backwards compatibility on 0.x.y versions (but we do our best to keep it).
Version 0.4.2dev0
Released on: (in development)
General Changes and Enhancements
- `export_to_html` is now available even if `lxml` is not installed
- Add Jupyter Notebook integration (implements `_repr_html_`, `.head` and `.tail`)
- Fix code to remove some warnings
- Add support to read compressed files directly (like in `rows.import_from_csv("filename.csv.gz")`)
- `rows.Table` now returns a new table when sliced
- Remove functions `export_data` and `get_filename_and_fobj` (the new `Source` implements the features better).
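
The notebook integration and the slicing behaviour listed above can be pictured with a small stand-in class. This is only a sketch: `MiniTable` and its internals are hypothetical and are not the `rows.Table` implementation; only the names `_repr_html_`, `head` and `tail` come from the notes above.

```python
from html import escape


class MiniTable:
    """Hypothetical stand-in for a table; not the rows.Table implementation."""

    def __init__(self, field_names, records):
        self.field_names = list(field_names)
        self.records = [tuple(record) for record in records]

    def __len__(self):
        return len(self.records)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing builds a new table instead of returning a bare list
            return MiniTable(self.field_names, self.records[key])
        return self.records[key]

    def head(self, n=10):
        return self[:n]

    def tail(self, n=10):
        # self[-0:] would return everything, so handle n == 0 explicitly
        return self[-n:] if n else self[:0]

    def _repr_html_(self):
        # Jupyter calls this method (when present) to render an object as HTML
        header = "".join(f"<th>{escape(name)}</th>" for name in self.field_names)
        body = "".join(
            "<tr>"
            + "".join(f"<td>{escape(str(value))}</td>" for value in record)
            + "</tr>"
            for record in self.records
        )
        return (
            f"<table><thead><tr>{header}</tr></thead>"
            f"<tbody>{body}</tbody></table>"
        )


table = MiniTable(["id", "name"], [(1, "a"), (2, "b"), (3, "c")])
first_two = table[:2]  # a new MiniTable, not a list
```

Returning a table from a slice keeps `head` and `tail` trivial, since both are just slices.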
Plugins
- Add param `max_rows` to `create_table` (import only part of a table, all plugins are supported)
- Add `start_row`, `end_row`, `start_column` and `end_column` to ODS plugin
- Prevent `xlrd` (XLS plugin) from printing wrong sector size warning ("WARNING *** file size (551546) not 512 + multiple of sector size (512)")
- Set `rows.Table` name (`table.meta["name"]`) for ODS, XLS and XLSX plugins
- Add option to set `<caption>` tag in `export_to_html`
- Use correct table name when exporting to PostgreSQL
- Carefully close all fobjs in pgimport/pgexport
- Added CSV dialect "excel-semicolon"
- Improved PostgreSQL import from CSV (pgimport) when dealing with null values
- PDF now supports `page_numbers` as string (range of numbers)
- Add support to export to multiple sheets on the same XLSX file
Command-Line Interface
- `rows schema` is now "lazy" (before it imported the whole file, even if samples were defined)
- Add support for compressed files output on `rows pdf-to-text` and `rows schema`
- HTTP cache is disabled by default (this may change in the future)
- Accept URI schemes in `rows convert`
- `rows convert` now supports compressed files
- `rows pgexport` now accepts query instead of table name (useful for selecting from a view since `\copy` cannot use a view but can use a query instead of a table name).
- Detect input encoding whenever possible
- Add `--quiet` to some commands (fix `progress`)
- Add plugins' input/output options to `convert`
- Add `rows csv-merge` (lazily merge CSV files even if they don't share a common schema)
- Add `rows csv-clean` (lazily clean a CSV file, removing empty columns and creating a consistent output format)
- Add `rows list-sheets` (prints sheet names for ODS, XLS and XLSX files)
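
To illustrate what lazily merging CSV files without a common schema can look like, here is a minimal standard-library sketch. It is not the code behind `rows csv-merge`: the function name `merge_csv` and its two-pass strategy are assumptions made for this example.

```python
import csv


def merge_csv(input_filenames, output_filename, encoding="utf-8"):
    """Merge CSV files with possibly different headers, one row at a time."""
    # First pass: read only the header line of each file to build the
    # union of field names (order of first appearance is kept).
    field_names = []
    for filename in input_filenames:
        with open(filename, encoding=encoding, newline="") as fobj:
            header = next(csv.reader(fobj), [])
        for name in header:
            if name not in field_names:
                field_names.append(name)

    # Second pass: stream rows; fields missing from a file are left empty.
    with open(output_filename, mode="w", encoding=encoding, newline="") as out:
        writer = csv.DictWriter(out, fieldnames=field_names, restval="")
        writer.writeheader()
        for filename in input_filenames:
            with open(filename, encoding=encoding, newline="") as fobj:
                for row in csv.DictReader(fobj):
                    writer.writerow(row)
```

Only the header lines are read up front, so memory use stays constant no matter how large the input files are.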
Utils
- Add support for CSV format on schema export
- Use dataclasses to describe Source
- `import_from_source` now supports compressed files (and so all CLI commands)
- Add support for passing a `context` to `load_schema`
Bug Fixes
- #314 rows pgimport fails if using --schema
- #309 Fix file-magic detection
- #320 Get correct data if ODS spreadsheet has empty cells
- Fix slug function (so `"a/b"` will turn into `"a_b"`)
- Detect as fallback type if all values are empty
- Fix output on `rows schema` (was printing to stdout even if output file is provided)
- Fix `rows schema` (some output formats were not working properly)
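
The slug behaviour mentioned above (`"a/b"` becoming `"a_b"`) can be sketched as follows. This is an illustrative implementation under assumed rules (lowercase ASCII output, any run of other characters collapsed into one separator); it is not the library's own slug function.

```python
import re
import unicodedata


def slug(text, separator="_"):
    """Turn arbitrary text into a lowercase ASCII identifier-like string."""
    # Decompose accented characters and drop the non-ASCII parts
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Any run of characters other than letters/digits becomes one separator
    slugged = re.sub(r"[^a-z0-9]+", separator, ascii_text.lower())
    return slugged.strip(separator)
```

With these rules, `slug("a/b")` returns `"a_b"`, and repeated punctuation never produces a doubled separator.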
Version 0.4.1 (bugfix release)
Released on: 2019-02-14
General Changes and Enhancements
- Add new way to make docs (removes Sphinx and uses mkdocs + click-man + pycco)
- Update Dockerfile
Bug Fixes
- #305 "0" was not being deserialized by `IntegerField`
Version 0.4.0
Released on: 2019-02-09
General Changes and Enhancements
- #243 Change license to LGPL3.0.
- Added official Python 3.6 support.
- `Table.__add__` does not depend on table sizes anymore.
- Implemented `Table.__iadd__` (`table += other` will work).
- #234 Remove `BinaryField` from the default list of detection types.
Plugins
- #224 Add `|` as possible delimiter (CSV dialect detection).
- Export CSV in batches.
- Change CSV dialect detection sample size to 256KiB.
- #225 Create export callbacks (CSV and SQLite plugins).
- #270 Added options to export pretty text table frames (TXT plugin).
- #274 `start_row` and `start_column` now behave the same way in XLS and XLSX (starting from 0).
- #261 Add support to `end_row` and `end_column` on XLS and XLSX (thanks @Lrcezimbra for the suggestion).
- #4 Add PostgreSQL plugin (thanks to @juliano777).
- #290 Fix percent formatting reading on XLSX and ODS file formats (thanks to @jsbueno).
- #220 Do not use non-import_fields and force_types columns on type detection algorithm.
- #50 Create PDF extraction plugin with two backend libraries (`pymupdf` and `pdfminer.six`) and 3 table extraction algorithms.
- #294 Decrease XLSX reading time (thanks to @israelst).
- Change to pure Python version of Apache Thrift library (parquet plugin)
- @299 Change CSV field limit
Command-Line Interface
- #242 Add `--fields`/`--fields-exclude` to `convert`, `join` and `sum` (and rename `--fields-exclude` on `print`), also remove `--fields` from `query` (is not needed).
- #235 Implement `--http-cache` and `--http-cache-path`.
- #237 Implement `rows schema` (generates schema in text, SQL and Django models).
- Enable progress bar when downloading files.
- Create `pgimport` and `pgexport` commands.
- Create `csv-to-sqlite` and `sqlite-to-csv` commands.
- Create `pdf-to-text` command.
- Add shortcut for all command names: `2` can be used instead of `-to-` (so `rows pdf2text` is a shortcut to `rows pdf-to-text`).
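
The `2` shortcut described above amounts to a simple name mapping. The sketch below shows one way to derive it; the helper names `build_aliases` and `resolve_command` and the shortened `COMMANDS` list are hypothetical and are not taken from the CLI source.

```python
# Illustrative subset of command names taken from these notes
COMMANDS = ["convert", "csv-to-sqlite", "pdf-to-text", "sqlite-to-csv"]


def build_aliases(command_names):
    """Map each "x2y" shortcut to its full "x-to-y" command name."""
    return {
        name.replace("-to-", "2"): name
        for name in command_names
        if "-to-" in name
    }


def resolve_command(name, command_names=COMMANDS):
    """Return the full command name for a name or shortcut typed by the user."""
    if name in command_names:
        return name
    aliases = build_aliases(command_names)
    if name in aliases:
        return aliases[name]
    raise ValueError(f"Unknown command: {name}")
```

Deriving the aliases from the command list means a new `x-to-y` command gets its `x2y` shortcut without extra registration.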
Utils
- Create `utils.open_compressed` helper function: can read/write files, automatically dealing with on-the-fly compression.
- Add progress bar support to `utils.download_file` (thanks to `tqdm` library).
- Add helper class `utils.CsvLazyDictWriter` (write as `dict`s without needing to pass the keys in advance).
- Add `utils.pgimport` and `utils.pgexport` functions.
- Add `utils.csv2sqlite` and `utils.sqlite2csv` functions.
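
The idea of writing `dict`s without passing the keys in advance can be sketched with the standard `csv` module. `LazyDictWriter` below is a hypothetical class written for this example; it does not reproduce the interface or the implementation of `utils.CsvLazyDictWriter`.

```python
import csv
import io


class LazyDictWriter:
    """Hypothetical sketch: a dict writer that learns its header lazily."""

    def __init__(self, fobj, **csv_options):
        self.fobj = fobj
        self.csv_options = csv_options
        self._writer = None

    def writerow(self, row):
        if self._writer is None:
            # The keys of the first row define the header
            self._writer = csv.DictWriter(
                self.fobj, fieldnames=list(row.keys()), **self.csv_options
            )
            self._writer.writeheader()
        self._writer.writerow(row)


buffer = io.StringIO()
writer = LazyDictWriter(buffer, lineterminator="\n")
writer.writerow({"id": 1, "name": "first"})
writer.writerow({"id": 2, "name": "second"})
```

Deferring the creation of the underlying writer until the first row arrives is what removes the need to know the field names up front.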
Bug Fixes
- #223 `UnicodeDecodeError` on dialect detection.
- #214 Problem detecting dialect.
- #181 Create slugs inside `Table.__init__`.
- #221 Error on `pip install rows`.
- #238 `import_from_dicts` supports generator as input
- #239 Use correct field ordering
- #299 Integer field detected for numbers started with zero
Version 0.3.1
Released on: 2017-05-08
Enhancements
- Move information on README to a site, organize and add more examples. Documentation is available at turicas.info/rows. Thanks to @ellisonleao for Sphinx implementation and @ramiroluz for new examples.
- Little code refactorings.
Bug Fixes
- #200 Escape output when exporting to HTML (thanks to @arloc)
- Fix some tests
- #215 DecimalField does not handle negative values correctly if using locale (thanks to @draug3n for reporting)
Version 0.3.0
Released on: 2016-09-02
Backwards Incompatible Changes
Bug Fixes
- Return `None` on XLS blank cells;
- #188 Change `sample_size` on encoding detection.
Enhancements and Refactorings
- `rows.fields.detect_fields` will consider `BinaryField` if all the values are `str` (Python 2)/`bytes` (Python 3) and all other fields will work only with `unicode` (Python 2)/`str` (Python 3);
- Plugins HTML and XPath now use a better way to return inner HTML (when `preserve_html=True`);
- #189 Optimize `Table.__add__`.
New Features
- Support for Python 3 (finally!);
- `rows.fields.BinaryField` now automatically uses base64 to encode/decode;
- Added `encoding` information to `rows.Table` metadata in text plugins;
- Added `sheet_name` information to `rows.Table` metadata in XLS and XLSX plugins;
- #190 Add `query_args` to `import_from_sqlite`;
- #177 Add `dialect` to `export_to_csv`.
Version 0.2.1
Released on: 2016-08-10
Backwards Incompatible Changes
- `rows.utils.export_to_uri` signature is now like `rows.export_to_*` (first the `rows.Table` object, then the URI)
- Changed default table name in `import_from_sqlite` and `export_to_sqlite` (from `rows` and `rows_{number}` to `table{number}`)
Bug Fixes
- #170 (SQLite plugin) Error converting `int` and `float` when value is `None`.
- #168 Use `Field.serialize` if the field type is unknown (affecting: XLS, XLSX and SQLite plugins).
- #167 Use more data to detect dialect, delimit the possible delimiters and fall back to excel if it can't be detected.
- #176 Problem using quotes on CSV plugin.
- #179 Fix double underscore problem on `rows.utils.slug`
- #175 Fix `None` serialization/deserialization in all plugins (and also field types)
- #172 Expose all tables in `rows query` for SQLite databases
- Fix `examples/cli/convert.sh` (missing `-`)
- Avoid SQL injection in table name
Enhancements and Refactorings
- Refactor `rows.utils.import_from_uri`
- Encoding and file type are better detected on `rows.utils.import_from_uri`
- Added helper functions to `rows.utils` regarding encoding and file type/plugin detection
- There's a better description of plugin metadata (MIME types accepted) on `rows.utils` (should be refactored to be inside each plugin)
- Moved `slug` and `ipartition` functions to `rows.plugins.utils`
- Optimize `rows query` when using only one SQLite source
Version 0.2.0
Released on: 2016-07-15
Backwards Incompatible Changes
- `rows.fields.UnicodeField` was renamed to `rows.fields.TextField`
- `rows.fields.BytesField` was renamed to `rows.fields.BinaryField`
Bug Fixes
- Fix import errors on older versions of urllib3 and Python (thanks to @jeanferri)
- #156 `BoolField` should not accept "0" and "1" as possible values
- #86 Fix `Content-Type` parsing
- Fix locale-related tests
- #85 Fix `preserve_html` if `fields` is not provided
- Fix problem with big integers
- #131 Fix problem when empty sample data
- Fix problem with `unicode` and `DateField`
- Fix `PercentField.serialize(None)`
- Fix bug with `Decimal` receiving `''`
- Fix bug in `PercentField.serialize(Decimal('0'))`
- Fix nested table behaviour on HTML plugin
General Changes
- (EXPERIMENTAL) Add `rows.FlexibleTable` class (with help on tests from @maurobaraildi)
- Lots of refactorings
- Add `rows.operations.transpose`
- Add `Table.__repr__`
- Rename `rows.fields.UnicodeField` to `rows.fields.TextField` and `rows.fields.ByteField` to `rows.fields.BinaryField`
- Add a man page (thanks to @kretcheu)
- #40 The package is available on Debian!
- #120 The package is available on Fedora!
- Add some examples
- #138 Add `rows.fields.JSONField`
- #146 Add `rows.fields.EmailField`
- Enhance encoding detection using file-magic library
- #160 Add support for column get/set/del in `rows.Table`
Tests
- Fix "\r\n" on tests to work on Windows
- Enhance tests with `mock` to assure some functions are being called
- Improve some tests
Plugins
- Add plugin JSON (thanks @sxslex)
- #107 Add `import_from_txt`
- #149 Add `import_from_xpath`
- (EXPERIMENTAL) Add `import_from_ods`
- (EXPERIMENTAL) Add `import_from_parquet`
- Add `import_from_sqlite` and `export_to_sqlite` (implemented by @turicas with help from @infog)
- Add `import_from_xlsx` and `export_to_xlsx` (thanks to @RhenanBartels)
- Autodetect delimiter in CSV files
- Export to TXT, JSON and XLS also support an already opened file and CSV can export to memory (thanks to @jeanferri)
- #93 Add HTML helpers inside `rows.plugins.html`: `count_tables`, `extract_text`, `extract_links` and `tag_to_dict`
- #162 Add `import_from_dicts` and `export_to_dicts`
- Refactor `export_to_txt`
Utils
- Create `rows.plugins.utils`
- #119 Rename field name if name is duplicated (to "field_2", "field_3", ..., "field_N") or if starts with a number.
- Add option to import only some fields (`import_fields` parameter inside `create_table`)
- Add option to export only some fields (`export_fields` parameter inside `prepare_to_export`)
- Add option `force_types` to force field types in some columns (instead of detecting) on `create_table`.
- Support lazy objects on `create_table`
- Add `samples` parameter to `create_table`
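
The duplicated-name renaming from #119 can be illustrated with a short sketch. The function name `make_unique_names` is hypothetical, and the exact rules used by the library (including how names starting with a number are handled) are not shown in these notes, so only the `_2`, `_3`, ... suffixing is covered here.

```python
def make_unique_names(names):
    """Append _2, _3, ... to names that were already seen."""
    seen = set()
    result = []
    for name in names:
        candidate, index = name, 1
        # Keep increasing the suffix until the candidate is unused
        while candidate in seen:
            index += 1
            candidate = f"{name}_{index}"
        seen.add(candidate)
        result.append(candidate)
    return result
```

For example, `make_unique_names(["field", "field", "field", "other"])` returns `["field", "field_2", "field_3", "other"]`.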
CLI
- Add option to disable SSL verification (`--verify-ssl=no`)
- Add `print` command
- Add `--version`
- CLI is not installed by default (should be installed as `pip install rows[cli]`)
- Automatically detect default encoding (if not specified)
- Add `--order-by` to some commands and remove `sort` command. #111
- Do not use locale by default
- Add `query` command: converts (from many sources) internally to SQLite, execute the query and then export
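
The approach described for the `query` command (load the data into SQLite, run the query, return the result) can be sketched with the standard `sqlite3` module. The function `query_rows` and its list-of-dicts input format are assumptions made for this example, not the command's actual code.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name so it cannot break the SQL statement."""
    return '"' + name.replace('"', '""') + '"'


def query_rows(tables, sql):
    """Load lists of dicts into in-memory SQLite tables and run one query.

    `tables` maps a table name to a non-empty list of dicts sharing keys.
    """
    connection = sqlite3.connect(":memory:")
    try:
        for table_name, records in tables.items():
            field_names = list(records[0].keys())
            columns = ", ".join(quote_identifier(name) for name in field_names)
            placeholders = ", ".join("?" for _ in field_names)
            table = quote_identifier(table_name)
            connection.execute(f"CREATE TABLE {table} ({columns})")
            connection.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})",
                ([record[name] for name in field_names] for record in records),
            )
        cursor = connection.execute(sql)
        header = [column[0] for column in cursor.description]
        return [dict(zip(header, row)) for row in cursor.fetchall()]
    finally:
        connection.close()
```

A call such as `query_rows({"table1": records}, "SELECT name FROM table1")` returns the matching rows as a list of dicts, which could then be handed to any exporter.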
Version 0.1.1
Released on: 2015-09-03
- Fix code to run on Windows (thanks @sxslex)
- Fix locale (name, default name etc.)
- Remove `filemagic` dependency (waiting for `python-magic` to be available on PyPI)
- Write log of changes for `0.1.0` and `0.1.1`
Version 0.1.0
Released on: 2015-08-29
- Implement `Table` and its basic methods
- Implement basic plugin support with many utilities and the following formats:
  - `csv` (input/output)
  - `html` (input/output)
  - `txt` (output)
  - `xls` (input/output)
- Implement the following field types - many of them with locale support:
  - `ByteField`
  - `BoolField`
  - `IntegerField`
  - `FloatField`
  - `DecimalField`
  - `PercentField`
  - `DateField`
  - `DatetimeField`
  - `UnicodeField`
- Implement basic `Table` operations:
  - `sum`
  - `join`
  - `transform`
  - `serialize`
- Implement a command-line interface with the following commands:
  - `convert`
  - `join`
  - `sort`
  - `sum`
- Add examples to the repository