Excel File Preparation Guidelines

Gurubase can interpret many different Excel structures, but how you format your tables still affects the quality of extraction and analysis. Follow these guidelines to structure your data for processing.

1. Complete Headers and Columns

Always include complete headers and columns for all data sections.
[Examples omitted: a good example, and a bad example that is missing the Price and Stock columns]
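A quick check before uploading can catch columns that hold data but have no header. The sketch below is not part of Gurubase. It assumes the sheet has already been loaded into a plain list of rows (first row is the header, empty cells are None or blank strings), and the function name is hypothetical.

```python
def find_unlabeled_columns(rows):
    """Return 0-based indexes of columns that hold data but have no header.

    `rows` is an assumed in-memory sheet: a list of rows where the first
    row is the header and empty cells are None or blank strings.
    """
    def is_blank(cell):
        return cell is None or str(cell).strip() == ""

    header, data = rows[0], rows[1:]
    width = max(len(row) for row in rows)
    unlabeled = []
    for col in range(width):
        has_header = col < len(header) and not is_blank(header[col])
        has_data = any(col < len(row) and not is_blank(row[col]) for row in data)
        if has_data and not has_header:
            unlabeled.append(col)
    return unlabeled


# Illustrative data: the last two columns were left without headers.
sheet = [
    ["Product", "Category", None, None],
    ["Laptop", "Electronics", 999.0, 12],
    ["Desk", "Furniture", 250.0, 4],
]
print(find_unlabeled_columns(sheet))  # [2, 3]
```

An empty result means every column that contains data also has a header.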

2. Use Clear Structures

Avoid complex nested structures and multi-tables when possible.
[Examples omitted: a good example and a bad example]
For Sub-tables: When you have sub-tables or multiple data sections, repeat headers and columns for each section.
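As a rough illustration of repeating headers for each section, the sketch below lays out several sub-tables so that each one starts with its own copy of the header row. The layout (a one-cell title row followed by the header) and the function name are assumptions made for this example, not a format that Gurubase requires.

```python
def with_repeated_headers(header, sections):
    """Lay out sub-tables so every section carries its own copy of the header.

    `sections` maps an assumed section title to that section's data rows.
    """
    rows = []
    for title, records in sections.items():
        rows.append([title])       # section label on its own row
        rows.append(list(header))  # header repeated for this section
        rows.extend(records)
    return rows


# Illustrative data: two quarterly sub-tables that share the same columns.
header = ["Product", "Units"]
sections = {
    "Q1": [["Laptop", 10]],
    "Q2": [["Laptop", 14], ["Desk", 3]],
}
for row in with_repeated_headers(header, sections):
    print(row)
```

No blank separator rows are written between sections, which keeps the result in line with the guideline on empty rows.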

3. Clean Empty Rows and Columns

Remove all empty rows and columns to keep files as small as possible.
[Examples omitted: the same sheet before cleaning and after cleaning]
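The cleaning step can be sketched in a few lines. This example assumes the sheet is held as a plain list of rows in which empty cells are None or blank strings. The function name is hypothetical, and a real workflow would read and write the rows with a spreadsheet library.

```python
def drop_empty_rows_and_columns(rows):
    """Return a copy of `rows` without fully empty rows or columns."""
    def is_blank(cell):
        return cell is None or str(cell).strip() == ""

    # Keep only rows that contain at least one non-blank cell.
    kept_rows = [row for row in rows if not all(is_blank(c) for c in row)]
    if not kept_rows:
        return []

    # Pad short rows so every row can be indexed by column.
    width = max(len(row) for row in kept_rows)
    padded = [list(row) + [None] * (width - len(row)) for row in kept_rows]

    # Keep only columns that contain at least one non-blank cell.
    keep_cols = [c for c in range(width)
                 if not all(is_blank(row[c]) for row in padded)]
    return [[row[c] for c in keep_cols] for row in padded]


# Illustrative data: one empty row and one empty column.
sheet = [
    ["Product", None, "Price"],
    [None, None, None],
    ["Laptop", "", 999.0],
    ["Desk", None, 250.0],
]
print(drop_empty_rows_and_columns(sheet))
# [['Product', 'Price'], ['Laptop', 999.0], ['Desk', 250.0]]
```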

4. Split into Smaller Sheets

Divide large datasets into multiple smaller, focused sheets.
Example Sheet Structure:
  • Sheet 1: Customer Information
  • Sheet 2: Product Catalog
  • Sheet 3: Sales Transactions
  • Sheet 4: Inventory Levels
Benefits:
  • Faster processing
  • Better organization
  • Easier to maintain
  • Reduced file size

5. Use Table Structure Over Form Structure

Prefer a tabular data layout over form-based layouts.
[Examples omitted: a good example (table structure) and a bad example (form structure)]
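To make the difference concrete, the sketch below converts form-style records (one label/value pair per field) into a single table with one header row and one row per record. The input shape and the function name are assumptions made for illustration.

```python
def forms_to_table(forms):
    """Turn form-style records into a header row plus one data row per record.

    Each form is assumed to be a list of (field, value) pairs.
    """
    # Collect field names in first-seen order to build the header.
    header = []
    for form in forms:
        for field, _ in form:
            if field not in header:
                header.append(field)

    rows = [header]
    for form in forms:
        values = dict(form)
        # Fields missing from a form become empty cells (None).
        rows.append([values.get(field) for field in header])
    return rows


# Illustrative data: two filled-in forms, the second without a Plan field.
forms = [
    [("Name", "Alice"), ("City", "Berlin"), ("Plan", "Pro")],
    [("Name", "Bob"), ("City", "Lisbon")],
]
for row in forms_to_table(forms):
    print(row)
# ['Name', 'City', 'Plan']
# ['Alice', 'Berlin', 'Pro']
# ['Bob', 'Lisbon', None]
```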

6. Proper Nested Structure

When nesting is necessary, ensure that each nested level completely encompasses the data below it, and merge headers around grouped content.
[Examples omitted: two good examples (proper nested structure), a bad example, and the fixed version of the bad example]
Key Principles for Nested Structures:
  1. Complete Coverage: Each nested level should fully encompass the data below it
  2. Merged Headers: Use merged cells to group related columns under main categories
  3. Consistent Structure: Maintain the same pattern throughout the sheet
  4. Clear Hierarchy: Make the relationship between levels obvious
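The Complete Coverage and Merged Headers principles can be checked mechanically. In the sketch below, each merged group header is described by an assumed (label, span) pair, where span is the number of columns the merged cell covers. The function raises an error when the groups do not cover every column, and otherwise returns one flat name per column. All names here are hypothetical.

```python
def flatten_nested_header(groups, columns):
    """Combine merged group headers with column headers into flat names.

    `groups` is a list of (label, span) pairs for the merged top-level cells;
    `columns` is the list of second-level column headers.
    """
    covered = sum(span for _, span in groups)
    if covered != len(columns):
        # Complete coverage: the merged headers must span every column.
        raise ValueError(
            f"group headers cover {covered} columns, but there are {len(columns)}"
        )

    flat = []
    position = 0
    for label, span in groups:
        for name in columns[position:position + span]:
            flat.append(f"{label} / {name}")
        position += span
    return flat


# Illustrative data: two merged headers, each spanning two columns.
groups = [("Product", 2), ("Q1 Sales", 2)]
columns = ["Name", "SKU", "Units", "Revenue"]
print(flatten_nested_header(groups, columns))
# ['Product / Name', 'Product / SKU', 'Q1 Sales / Units', 'Q1 Sales / Revenue']
```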

7. Column-Oriented Tables

Gurubase can also handle column-oriented Excel files. Just make sure you include proper headers above the data cells.
[Example omitted: Good Example (Proper Nested Structure)]

8. File Size Optimization

Keep files as small as possible for better performance.
  • Remove unused worksheets
  • Delete empty rows and columns
  • Use appropriate data types
  • Compress images if present
  • Avoid unnecessary formatting

9. Common Mistakes to Avoid

  1. Missing headers - Always include column headers
  2. Unclear header hierarchy - Make nested header relationships obvious
  3. Inconsistent header spanning - Use merged cells consistently for grouped columns
  4. Mixed data types - Keep consistent formats within columns
  5. Excessive nesting - Prefer flat structures when possible
  6. Large single sheets - Split into multiple focused sheets
  7. Unnecessary formatting - Remove complex styling
  8. Hidden data - Ensure all relevant data is visible
  9. Inconsistent naming - Use clear, consistent naming conventions
  10. Ambiguous header names - Use descriptive, specific header labels
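Some of these mistakes can be caught automatically. As one example, the sketch below looks for mixed data types within a column. It assumes the sheet is a plain list of rows whose first row is the header, and the function name is hypothetical.

```python
def find_mixed_type_columns(rows):
    """Map each header to the value types found in its column, if more than one.

    Empty cells (None) are ignored. Note that int and float count as
    different types in this simple check.
    """
    header, data = rows[0], rows[1:]
    mixed = {}
    for col, name in enumerate(header):
        kinds = {
            type(row[col]).__name__
            for row in data
            if col < len(row) and row[col] is not None
        }
        if len(kinds) > 1:
            mixed[name] = sorted(kinds)
    return mixed


# Illustrative data: one price was entered as text instead of a number.
sheet = [
    ["Product", "Price", "Stock"],
    ["Laptop", 999.0, 12],
    ["Desk", "250 USD", 4],
    ["Lamp", 35.5, None],
]
print(find_mixed_type_columns(sheet))  # {'Price': ['float', 'str']}
```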
Following these guidelines will significantly improve the quality of your Excel data extraction and analysis results.