
# Excel Extraction Best Practices

> Learn how to prepare Excel files for optimal extraction and analysis

Gurubase can interpret various Excel structures, but proper formatting significantly improves extraction quality. Follow these guidelines for optimal results.

## Quick Reference

<CardGroup cols={3}>
  <Card title="Complete Headers" icon="table-columns">
    Include all column headers
  </Card>

  <Card title="Clear Structure" icon="table-cells">
    Avoid complex nesting
  </Card>

  <Card title="No Empty Rows" icon="broom">
    Remove blank rows/columns
  </Card>

  <Card title="Smaller Sheets" icon="layer-group">
    Split large datasets
  </Card>

  <Card title="Table Layout" icon="table">
    Use tables, not forms
  </Card>

  <Card title="Optimize Size" icon="compress">
    Keep files small
  </Card>
</CardGroup>

***

## Structure Guidelines

### 1. Complete Headers and Columns

Give every column that contains data a header, and keep the full set of columns in every data section.

✅ **Good Example**

<Frame>
  <img src="https://mintcdn.com/gurubase/PMnPe4PheUTq6epp/images/guides/excel-extraction/1-good.png?fit=max&auto=format&n=PMnPe4PheUTq6epp&q=85&s=b655f23ccbd30fb161abe9817b84c29d" alt="" width="936" height="158" data-path="images/guides/excel-extraction/1-good.png" />
</Frame>

❌ **Bad Example**

<Frame>
  <img src="https://mintcdn.com/gurubase/PMnPe4PheUTq6epp/images/guides/excel-extraction/1-bad.png?fit=max&auto=format&n=PMnPe4PheUTq6epp&q=85&s=3b43900aee628a2ceebfe345c05d9511" alt="" width="936" height="150" data-path="images/guides/excel-extraction/1-bad.png" />
</Frame>

*Missing Price and Stock columns*
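A quick pre-upload check can catch this kind of gap. The sketch below is illustrative only: it assumes the sheet has already been read into a list of rows (for example with a spreadsheet library of your choice), with the first row holding the headers. The function name `columns_missing_headers` is hypothetical and not part of Gurubase.

```python
def columns_missing_headers(rows):
    """Return 0-based indexes of columns that hold data but have no header.

    Assumption: rows[0] is the header row and the remaining rows are data.
    """
    if not rows:
        return []

    def is_blank(value):
        return value is None or str(value).strip() == ""

    header, data = rows[0], rows[1:]
    width = max(len(r) for r in rows)
    missing = []
    for col in range(width):
        name = header[col] if col < len(header) else None
        has_data = any(col < len(r) and not is_blank(r[col]) for r in data)
        if has_data and is_blank(name):
            missing.append(col)
    return missing


# Hypothetical sample data: the last two columns have values but no header.
sample = [
    ["Product", "Category", None, None],
    ["Laptop", "Electronics", 999.99, 15],
    ["Desk", "Furniture", 249.50, 8],
]
print(columns_missing_headers(sample))  # [2, 3]
```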

### 2. Use Clear Structures

Avoid complex nested structures and multiple tables on the same sheet where possible.

✅ **Good Example**

<Frame>
  <img src="https://mintcdn.com/gurubase/PMnPe4PheUTq6epp/images/guides/excel-extraction/2-good.png?fit=max&auto=format&n=PMnPe4PheUTq6epp&q=85&s=18821b3e0d9472e5819a1eb98a04e5bc" alt="" width="1044" height="166" data-path="images/guides/excel-extraction/2-good.png" />
</Frame>

❌ **Bad Example**

<Frame>
  <img src="https://mintcdn.com/gurubase/PMnPe4PheUTq6epp/images/guides/excel-extraction/2-bad.png?fit=max&auto=format&n=PMnPe4PheUTq6epp&q=85&s=fb03a1f6d0878eec8566dfd601bd583c" alt="" width="1056" height="162" data-path="images/guides/excel-extraction/2-bad.png" />
</Frame>

**For Sub-tables**:

When you have sub-tables or multiple data sections, repeat headers and columns for each section:

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/2-sub-table.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=e24a0bc6c0f1a06cc001892b47989f99" alt="" width="1314" height="472" data-path="images/guides/excel-extraction/2-sub-table.png" />
</Frame>

### 3. Clean Empty Rows and Columns

Remove all empty rows and columns to keep files as small as possible.

**Before Cleaning:**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/3-bad.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=ee0c841d1ae95ae2b332b85242bc04c0" alt="" width="1032" height="272" data-path="images/guides/excel-extraction/3-bad.png" />
</Frame>

**After Cleaning:**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/3-good.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=6e37166e0c09fb7976808a0313009c3f" alt="" width="624" height="164" data-path="images/guides/excel-extraction/3-good.png" />
</Frame>
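The cleanup shown above can also be scripted. This is a minimal sketch that assumes the sheet is available as a list of rows, where empty cells are `None` or blank strings; `drop_empty_rows_and_columns` is a hypothetical helper name.

```python
def drop_empty_rows_and_columns(rows):
    """Remove rows and columns in which every cell is empty."""

    def is_blank(value):
        return value is None or str(value).strip() == ""

    # Drop rows that contain no values at all.
    kept_rows = [r for r in rows if not all(is_blank(v) for v in r)]
    if not kept_rows:
        return []

    # Pad ragged rows so every row has the same number of cells.
    width = max(len(r) for r in kept_rows)
    padded = [list(r) + [None] * (width - len(r)) for r in kept_rows]

    # Keep only the columns that have at least one value.
    keep_cols = [c for c in range(width)
                 if not all(is_blank(r[c]) for r in padded)]
    return [[r[c] for c in keep_cols] for r in padded]


# Hypothetical sheet with a blank leading row and blank spacer column.
messy = [
    [None, None, None, None],
    [None, "Name", None, "Qty"],
    [None, None, None, None],
    [None, "Pen", None, 3],
]
print(drop_empty_rows_and_columns(messy))  # [['Name', 'Qty'], ['Pen', 3]]
```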

### 4. Split into Smaller Sheets

Divide large datasets into multiple smaller, focused sheets.

| Sheet   | Content              |
| ------- | -------------------- |
| Sheet 1 | Customer Information |
| Sheet 2 | Product Catalog      |
| Sheet 3 | Sales Transactions   |
| Sheet 4 | Inventory Levels     |

This speeds up processing, improves organization, and reduces file size.
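One way to split a single large table into focused sheets is to group rows by the value of one column. The sketch below assumes rows are held as lists with the header first; `split_by_column` and the sample column names are hypothetical. Each resulting group keeps its own copy of the header row, so it can be written to a separate sheet.

```python
def split_by_column(rows, column_name):
    """Group data rows into one header-prefixed table per distinct value."""
    if not rows:
        return {}
    header, data = rows[0], rows[1:]
    key_index = header.index(column_name)  # raises ValueError if absent
    sheets = {}
    for row in data:
        # Start each new group with a copy of the header row.
        sheets.setdefault(row[key_index], [list(header)]).append(row)
    return sheets


# Hypothetical sample: one table split into a sheet per region.
orders = [
    ["Region", "Item", "Qty"],
    ["North", "Pen", 3],
    ["South", "Ink", 1],
    ["North", "Pad", 2],
]
for name, table in split_by_column(orders, "Region").items():
    print(name, table)
```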

### 5. Use Table Structure Over Form Structure

Prefer a tabular layout over a form-based layout.

✅ **Good Example (Table Structure)**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/5-good.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=3a50f4b190db8a19834f3b0055482fbc" alt="" width="710" height="306" data-path="images/guides/excel-extraction/5-good.png" />
</Frame>

❌ **Bad Example (Form Structure)**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/5-bad.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=e6c112758941b8928d37da97cdaa29a7" alt="" width="712" height="248" data-path="images/guides/excel-extraction/5-bad.png" />
</Frame>
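If your data already lives in a form-style sheet, it can often be reshaped into a table before upload. The following sketch assumes a specific form layout that may not match yours: each row is a `label, value` pair, and a blank row separates one record from the next. The helper name `form_to_table` is hypothetical.

```python
def form_to_table(form_rows):
    """Convert label/value form rows into a header row plus one row per record.

    Assumptions: column 0 holds the label, column 1 holds the value,
    and a row with a blank label ends the current record.
    """
    records, current = [], {}
    for row in form_rows:
        label = row[0] if len(row) > 0 else None
        value = row[1] if len(row) > 1 else None
        if label is None or str(label).strip() == "":
            if current:
                records.append(current)
                current = {}
            continue
        # Normalize "Name:" to "Name" so labels become clean headers.
        current[str(label).strip().rstrip(":").strip()] = value
    if current:
        records.append(current)

    # Build the header from labels in first-seen order.
    header = []
    for record in records:
        for key in record:
            if key not in header:
                header.append(key)
    return [header] + [[record.get(k) for k in header] for record in records]


# Hypothetical form-style input with two records.
form = [
    ["Name:", "Alice"],
    ["Email:", "alice@example.com"],
    [None, None],
    ["Name:", "Bob"],
    ["Email:", "bob@example.com"],
]
for row in form_to_table(form):
    print(row)
```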

***

## Advanced Formatting

### 6. Proper Nested Structure

When nesting is necessary, make sure each parent header fully spans the columns it groups, and merge the header cells around the grouped content.

✅ **Good Example (Proper Nested Structure)**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/6-good.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=8f2ab4602c3513694c9336f76b90cb29" alt="" width="918" height="358" data-path="images/guides/excel-extraction/6-good.png" />
</Frame>

✅ **Good Example (More Complex Nested Structure)**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/6-good-more-complex.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=7f8dd6fac40c06b24239fb70ff63e07b" alt="" width="1354" height="248" data-path="images/guides/excel-extraction/6-good-more-complex.png" />
</Frame>

❌ **Bad Example**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/6-bad.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=3662e62b3ad4505f28a8e13f47b3b3fb" alt="" width="1130" height="332" data-path="images/guides/excel-extraction/6-bad.png" />
</Frame>

Here is the corrected version of the same sheet:

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/6-bad-fixed.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=022696e5509621a80847b566f922340c" alt="" width="1128" height="420" data-path="images/guides/excel-extraction/6-bad-fixed.png" />
</Frame>

**Key Principles for Nested Structures**

1. **Complete Coverage**: Each nested level should fully encompass the data below it
2. **Merged Headers**: Use merged cells to group related columns under main categories
3. **Consistent Structure**: Maintain the same pattern throughout the sheet
4. **Clear Hierarchy**: Make the relationship between levels obvious
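To see why complete coverage matters, consider how a two-level header might be flattened into single column names. This is only an illustration, not a description of how Gurubase processes files. It assumes a merged header cell is read as a value in its first cell followed by empty cells, and that every column sits under some parent header. The name `flatten_headers` is hypothetical.

```python
def flatten_headers(parent_row, child_row, sep=" / "):
    """Combine a parent header row and a child header row into flat names.

    Assumption: a merged parent cell appears once, followed by None
    for the remaining columns it spans, so the value is carried forward.
    """

    def is_blank(value):
        return value is None or str(value).strip() == ""

    flat, current_parent = [], None
    for i, child in enumerate(child_row):
        parent = parent_row[i] if i < len(parent_row) else None
        if not is_blank(parent):
            current_parent = str(parent).strip()
        if is_blank(child):
            flat.append(current_parent or "")
        elif current_parent is None:
            flat.append(str(child).strip())
        else:
            flat.append(f"{current_parent}{sep}{str(child).strip()}")
    return flat


# Hypothetical header rows: "Q1" and "Q2" each span two columns.
parents = ["Product", "Q1", None, "Q2", None]
children = [None, "Sales", "Units", "Sales", "Units"]
print(flatten_headers(parents, children))
```

If a parent header does not span all of its columns, the carried-forward value attaches those columns to the wrong group, which is the ambiguity the principles above are meant to prevent.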

### 7. Column-Oriented Tables

Gurubase can also handle column-oriented Excel files. Just make sure you include proper headers above the data cells:

✅ **Good Example (Column-Oriented Table)**

<Frame>
  <img src="https://mintcdn.com/gurubase/jrPn0pcDQp1Ed5oM/images/guides/excel-extraction/7-good.png?fit=max&auto=format&n=jrPn0pcDQp1Ed5oM&q=85&s=fc4588a6a3a0d12dcba180139a17ba2c" alt="" width="300" data-path="images/guides/excel-extraction/7-good.png" />
</Frame>

***

## Optimization

### 8. File Size Optimization

Keep files as small as possible for better performance:

| Action                        | Benefit            |
| ----------------------------- | ------------------ |
| Remove unused worksheets      | Reduces file size  |
| Delete empty rows/columns     | Faster processing  |
| Use appropriate data types    | Better accuracy    |
| Compress images               | Smaller uploads    |
| Remove unnecessary formatting | Cleaner extraction |

### 9. Common Mistakes to Avoid

<AccordionGroup>
  <Accordion title="Structure Issues" icon="table">
    * **Missing headers** - Always include column headers
    * **Unclear header hierarchy** - Make nested header relationships obvious
    * **Inconsistent header spanning** - Use merged cells consistently for grouped columns
    * **Excessive nesting** - Prefer flat structures when possible
  </Accordion>

  <Accordion title="Data Issues" icon="database">
    * **Mixed data types** - Keep consistent formats within columns
    * **Hidden data** - Ensure all relevant data is visible
    * **Ambiguous header names** - Use descriptive, specific header labels
  </Accordion>

  <Accordion title="File Issues" icon="file-excel">
    * **Large single sheets** - Split into multiple focused sheets
    * **Unnecessary formatting** - Remove complex styling
    * **Inconsistent naming** - Use clear, consistent naming conventions
  </Accordion>
</AccordionGroup>
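Some of these mistakes can be detected automatically. As one example, the sketch below flags columns that mix data types, such as a price column containing both numbers and text. It assumes rows are held as lists with the header first; `mixed_type_columns` is a hypothetical helper, and treating integers and floats as the same kind is a design choice made here for illustration.

```python
def mixed_type_columns(rows):
    """Return {header: sorted kinds} for columns holding more than one kind."""

    def kind_of(value):
        # bool is checked first because it is a subclass of int in Python.
        if isinstance(value, bool):
            return "boolean"
        if isinstance(value, (int, float)):
            return "number"
        return type(value).__name__

    if not rows:
        return {}
    header, data = rows[0], rows[1:]
    report = {}
    for col, name in enumerate(header):
        kinds = {kind_of(r[col]) for r in data
                 if col < len(r) and r[col] is not None}
        if len(kinds) > 1:
            report[name] = sorted(kinds)
    return report


# Hypothetical sample: "Price" mixes numbers with the text "N/A".
catalog = [
    ["SKU", "Price"],
    ["A1", 9.99],
    ["B2", "N/A"],
    ["C3", 12],
]
print(mixed_type_columns(catalog))  # {'Price': ['number', 'str']}
```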

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Data Sources" icon="database" href="/guides/data-sources">
    Learn about all supported data sources
  </Card>

  <Card title="Create Your First Guru" icon="wand-magic-sparkles" href="/guides/create-guru">
    Build your AI assistant
  </Card>
</CardGroup>
