Skip to content

Working with XLSX

Comprehensive Excel (.xlsx) file handling with sheet selection, advanced header processing, and high-performance data operations.

The XLSX plugin is a part of the dpkit ecosystem providing these capabilities:

  • loadXlsxTable
  • saveXlsxTable

For example:

import { loadXlsxTable, saveXlsxTable } from "@dpkit/xlsx"
const table = await loadXlsxTable({path: "table.xlsx"})
// the field types will be automatically inferred
// or you can provide a Table Schema
await saveXlsxTable(table, {path: "output.xlsx"})
import { loadXlsxTable } from "@dpkit/xlsx"
// Load a simple XLSX file
const table = await loadXlsxTable({ path: "data.xlsx" })
// Load with custom dialect (specify sheet)
const table = await loadXlsxTable({
path: "data.xlsx",
dialect: {
sheetName: "Sheet2",
header: true
}
})
// Load multiple XLSX files (concatenated)
const table = await loadXlsxTable({
path: ["part1.xlsx", "part2.xlsx", "part3.xlsx"]
})
// Table is a Polars LazyDataFrame
const df = table.collect()
df.describe()
import { saveXlsxTable } from "@dpkit/xlsx"
// Save with default options
await saveXlsxTable(table, { path: "output.xlsx" })
// Save with custom sheet name
await saveXlsxTable(table, {
path: "output.xlsx",
dialect: {
sheetName: "Data"
}
})
import { loadXlsxTable } from "@dpkit/xlsx"
// Select by sheet number (1-indexed)
const table = await loadXlsxTable({
path: "workbook.xlsx",
dialect: {
sheetNumber: 2 // Load second sheet
}
})
// Select by sheet name
const table = await loadXlsxTable({
path: "workbook.xlsx",
dialect: {
sheetName: "Sales Data"
}
})
import { loadXlsxTable } from "@dpkit/xlsx"
// XLSX with multiple header rows:
// Year | 2023 | 2023 | 2024 | 2024
// Quarter | Q1 | Q2 | Q1 | Q2
// Revenue | 100 | 120 | 110 | 130
const table = await loadXlsxTable({
path: "multi-header.xlsx",
dialect: {
headerRows: [1, 2],
headerJoin: "_"
}
})
// Resulting columns: ["Year_Quarter", "2023_Q1", "2023_Q2", "2024_Q1", "2024_Q2"]
import { loadXlsxTable } from "@dpkit/xlsx"
// XLSX with comment rows
const table = await loadXlsxTable({
path: "with-comments.xlsx",
dialect: {
commentRows: [1, 2], // Skip first two rows
header: true
}
})
// Skip rows with comment character
const table = await loadXlsxTable({
path: "data.xlsx",
dialect: {
commentChar: "#" // Skip rows starting with #
}
})
import { loadXlsxTable } from "@dpkit/xlsx"
// Load from URL
const table = await loadXlsxTable({
path: "https://example.com/data.xlsx"
})
// Load multiple remote files
const table = await loadXlsxTable({
path: [
"https://api.example.com/data-2023.xlsx",
"https://api.example.com/data-2024.xlsx"
]
})
import { loadXlsxTable } from "@dpkit/xlsx"
// No header row (use generated column names)
const table = await loadXlsxTable({
path: "data.xlsx",
dialect: {
header: false
}
})
// Columns will be: field1, field2, field3, etc.
// Custom header row offset
const table = await loadXlsxTable({
path: "data.xlsx",
dialect: {
headerRows: [3] // Use third row as header
}
})