CSV module¶

This module implements the CSV (Comma-Separated Values) format, partly specified by RFC 4180.

Note

To use the features of this module, you have to put a special import statement at the begining of your LSP file: use csv;

Module functions¶

csv.parse(reader)¶

csv.parse(reader, options)

csv.parse(filename)

csv.parse(filename, options)

csv.parse(filename, charset)

csv.parse(filename, charset, options)

Reads the CSV file and returns a CsvContent. You can provide the filename to parse or you can directly provide the stream to read, previously opened with io.openRead(). When using a filename, you can also specify the encoding. If no encoding is provided, ISO-8859-1 is assumed.

Several options can be used to customize the behavior of the parser. These options must be specified in a map. The supported options are detailed at the end of this page.

Parameters:

filename (string) – Path to the CSV file to convert.
stream (streamreader) – Stream previously opened with io.openRead().
charset (charset) – Encoding used to convert bytes to characters.
options (map) – Optional parameters to customize the behavior of the parser.

Return type:

CsvContent

csv.deserialize(content)¶

csv.deserialize(content, options)

Identical to parse() but the CSV content is taken from a string instead of a file.

Parameters:

content (string) – CSV content to parse.
options (map) – Optional parameters to customize the behavior of the parser.

Return type:

CsvContent

csv.read(reader)¶

csv.read(reader, options)

csv.read(filename)

csv.read(filename, options)

csv.read(filename, charset)

csv.read(filename, charset, options)

Opens the CSV file and returns a CsvReader useful to read the file line by line.

Similarly to the parser method, you can provide the filename to parse or you can directly provide the stream to read, previously opened with io.openRead(). When using a filename, you can also specify the encoding. If no encoding is provided, ISO-8859-1 is assumed.

Several options can be used to customize the behavior of the parser. These options must be specified in a map. The supported options are detailed at the end of this page.

Parameters:

filename (string) – Path to the CSV file to convert.
stream (streamreader) – Stream previously opened with io.openRead().
charset (charset) – Encoding used to convert bytes to characters.
options (map) – Optional parameters to customize the behavior of the parser.

Return type:

CsvReader

csv.readString(content)¶

csv.readString(content, options)

Identical to read() but the CSV content is taken from a string instead of a file.

Parameters:

content (string) – CSV content to parse.
options (map) – Optional parameters to customize the behavior of the parser.

Return type:

CsvReader

Classes¶

class CsvContent¶

nbRows¶

Returns the number of rows.

Return type:: int

nbCols¶

Returns the number of columns.

Return type:: int

colNames¶

Returns the columns names. If no column names were found or provided, nil is returned.

Return type:: Map (array of strings) or nil

cols¶

Returns all columns as a map, indexed by column number. For each column, rows are indexed by number.

Return type:: Map

colsByName¶

Returns all columns as a map, indexed by column name. For each column, rows are indexed by number.

If no column names were found or provided, this method will throw an error.

Return type:: Map

rows¶

Returns all rows as a map, indexed by row number. For each row, columns are indexed by column number.

Return type:: Map

rowsByColName¶

Returns all rows as a map, indexed by row number. For each row, columns are indexed by column name.

If no column names were found or provided, this method will throw an error.

Return type:: Map

class CsvReader¶

rowNumber¶

Returns the number of read rows.

Return type:: int

colNames¶

Returns the columns names. If no column names were found or provided, nil is returned.

Return type:: Map (array of strings) or nil

nextRow()¶

Read the next line of the CSV file. It returns the read row indexed by column numbers, or nil if the end of file is reached.

Return type:: Map or nil

nextRowByColName()¶

Read the next line of the CSV file. It returns the read row indexed by column names, or nil if the end of file is reached.

If no column names were found or provided, this method will throw an error.

Return type:: Map or nil

Options summary¶

Global options¶

The following options apply to the entire CSV file.

Option name	Type	Default value	Description
delimiter	string (length 1)	nil	Character used to delimit columns. When the value is `nil`, the parser will automatically guess the most likely delimiter among {`,`, `;`, `\t`, `\|`}.
decimal	string (length 1)	`.`	Character to recognize as decimal point. If the column delimiter is `;` (guessed or defined) and this parameter is not overrided, the character used will be `,`.
quote	string (length 1)	`"`	Character used to denote the start and end of a quoted item. If the quoted items include column and/or row delimiters, they will be added to the string and their delimiter meaning will not be applied.
escape	string (length 1)	`"`	Character used to escape other characters.
headerRow	int	0	Row number used as the column names, and the start of the data. If column names are specified in the columnOptions, the names found in this row will be overrided. If a negative number is set, no column headers will be parsed from the file and only the names specified in the columnOptions will be used (if present).
skipLines	map (array of ints)	nil	Line numbers to skip (0-indexed).
skipEmptyLines	bool	true	True to ignore the empty lines between the records, false to translate empty lines to empty records. An empty line is a blank line with no fields or a line with only empty fields. Note that if the useDefaultEmpty policy is activated on at least one column, the fields will not be considered empty and thus the line will not be ignored.
trimWhitespace	bool	false	Trim leading and trailing spaces for each string field that is not between quotes.
longLinePolicy	string	ignoreCols	Specifies what to do upon encountering a line with too many fields. Allowed values are : `ignoreCols`: ignore columns; `addMissingCols`: add new columns with nil values to the other rows; `skipLine`: skip all line; `throwError`: throw an error.
shortLinePolicy	string	fillMissingCols	Specifies what to do upon encountering a line with too few fields. Allowed values are : `fillMissingCols`: fill missing columns with values matching the `emptyValuePolicy` of the column; `skipLine`: skip all line; `throwError`: throw an error.
nanValues	map (array of strings)		Values to consider as `nan`. Default is {“#N/A”, “#N/A N/A”, “#NA”, “-NaN”, “-nan”, “<NA>”, “N/A”, “NA”, “NaN”, “n/a”, “nan”}.
infValues	map (array of strings)		Values to consider as `inf`. Default is {“inf”, “Inf”}.
trueValues	map (array of strings)		Values to consider as `true`. Default is {“true”, “True”, “1”}.
falseValues	map (array of strings)		Values to consider as `false`. Default is {“false”, “False”, “0”}.
nilValues	map (array of strings)		Values to consider as `nil`. Default is {“nil”, “null”, “NULL”}.
columnOptions	map	nil	Options for each column, indexed by column number (see below).
internStrings	bool	false	Tells the parser to reuse the same strings rather than creating new duplicated ones. This option reduces the memory consumption of large CSVs but slightly decreases the parsing speed.

For the parser to work properly, the options must meet the following:

The characters used in delimiter, decimal and quote must all be different. In addition, they must not represent a line break. The line breaks supported are LF (\n) and CRLF (\r\n).

If a string is present in one of the arrays representing the values true, false, nan, inf or nil, then it cannot be present in another of these arrays.

The column names must be all different and of type string. If nil is found in the header row, the column name will be created as Unnamed: {column_index}.

Column options¶

The following options are applied per column, and must be specified with the global option columnOptions.

Option name	Type	Default value	Description
name	string	nil	Column name to use. When the value is `nil`, the value is automatically parsed from the header row.
type	string	nil	Type of values expected in the column. Allowed values are “bool”, “int”, “float” and “string”. If the type is `nil`, the parser will automatically guess the type according to the parsed value.
errorValuePolicy	string	setNil	Specifies what to do upon encountering a value that cannot be parsed in the specified type. Allowed values are : `useDefaultError`: use the value specified in the `defaultErrorValue` parameter; `setNil`: set the value as nil; `throwError`: throw an error. Note that this parameter will have no effect if the type of the column is not specified, or if the type is “string”.
defaultErrorValue			Value to be used when the policy `useDefaultError` is activated and an error is encountered. The type of the default value must be the same as the column type. If no overloaded, the following values will be used according to the column type: `double`: NaN; `int`: 0; `bool`: false. Note that this parameter will have no effect if the type of the column is not specified, or if the type is “string”.
emptyValuePolicy	string	setNil	Specifies what to do upon encountering an empty value without quotes. Allowed values are : `useDefaultEmpty`: use the value specified in the `defaultEmptyValue` parameter; `setNil`: set the value as nil; `throwError`: throw an error.
defaultEmptyValue			Value to be used when the policy `useDefaultEmpty` is activated and an empty value is encountered. The type of the default value must be the same as the column type. If no overloaded, the following values will be used according to the column type: `double`: NaN; `int`: 0; `bool`: false; `string`: an empty string (“”); type not specified: `nil`.