# Tutorial: Structure and Client Interface

This tutorial explains the directory structure generated by `wobbegongify` and how to query the data efficiently.

### 1. The Wobbegong Format

Wobbegong flattens complex Bioconductor objects into simple binary files accompanied by a JSON summary.

- **`summary.json`**: Contains metadata (dimensions, types) and **byte offsets**. This is the map clients use to figure out _where_ data lives.
- **`content`**: A binary file containing compressed chunks of data.
- **`stats`**: (For matrices) A binary file containing pre-calculated statistics like row sums.
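To make the role of the byte offsets concrete, here is a minimal, self-contained sketch of the general technique: a summary file records how large each compressed chunk is, and a reader seeks straight to the chunk it wants. The key name `chunk_bytes`, the helper functions, and the use of zlib are assumptions made for illustration and may not match the real Wobbegong schema. The sketch also stores chunk sizes and derives offsets from them; storing the offsets directly would work equally well.

```python
import json
import tempfile
import zlib
from pathlib import Path


def write_toy(directory: Path, chunks: list[bytes]) -> None:
    """Write a toy `content` file plus a `summary.json` recording chunk sizes."""
    compressed = [zlib.compress(chunk) for chunk in chunks]
    (directory / "content").write_bytes(b"".join(compressed))
    # "chunk_bytes" is a hypothetical key, not necessarily the real schema.
    summary = {"chunk_bytes": [len(c) for c in compressed]}
    (directory / "summary.json").write_text(json.dumps(summary))


def read_chunk(directory: Path, index: int) -> bytes:
    """Read and decompress one chunk without reading the others."""
    sizes = json.loads((directory / "summary.json").read_text())["chunk_bytes"]
    offset = sum(sizes[:index])  # start = total size of all preceding chunks
    with open(directory / "content", "rb") as handle:
        handle.seek(offset)
        return zlib.decompress(handle.read(sizes[index]))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_toy(root, [b"alpha", b"beta", b"gamma"])
    second = read_chunk(root, 1)
    print(second)
```

Because only the requested byte range is read, the same idea carries over to remote files fetched with range requests.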

Running `wobbegongify(obj, "my_study")` on a `SingleCellExperiment` creates a directory hierarchy like this:

```text
my_study/
├── summary.json          # Top-level metadata
├── assays/               # Matrix data
│   ├── 0/
│   │   ├── summary.json
│   │   ├── content
│   │   └── stats
│   └── ...
└── reduced_dimensions/   # Reduced dims (stored as DataFrames)
    ├── 0/
    │   ├── summary.json
    │   └── content
    └── ...
```
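Since every component in the tree above carries its own `summary.json`, one way to see what an output directory contains is to search for those files. The sketch below builds a mock copy of the layout in a temporary directory and lists its components. `list_components` is a hypothetical helper written for this tutorial, not part of the package.

```python
import tempfile
from pathlib import Path


def list_components(root: Path) -> list[str]:
    """Return the relative path of every directory that holds a summary.json."""
    return sorted(
        path.parent.relative_to(root).as_posix()
        for path in root.rglob("summary.json")
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my_study"
    # Mock the layout shown above; "" stands for the top-level directory.
    for sub in ["", "assays/0", "reduced_dimensions/0"]:
        (root / sub).mkdir(parents=True, exist_ok=True)
        (root / sub / "summary.json").write_text("{}")
    components = list_components(root)
    print(components)
```

The top-level directory is reported as `.`, followed by each nested component.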

### 2. Supported Objects

#### BiocFrame

A `BiocFrame` is saved as a series of compressed columns.

```python
from biocframe import BiocFrame
from wobbegong import wobbegongify

df = BiocFrame({"gene": ["A", "B"], "val": [1, 2]})
wobbegongify(df, "data/df")
```

#### Matrices (Dense & Sparse)

Matrices are saved **row-wise**, so each row can be read on its own. This layout suits genomic viewers that need to show the expression of a specific gene across all cells.

- **Dense**: Rows are written sequentially.
- **Sparse**: For each row, the non-zero values and their delta-encoded column indices are written.
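Delta encoding stores each index as the gap from the previous one, which turns a sorted run of large indices into small integers that tend to compress better. The sketch below shows the general technique on one sparse row. The function names are hypothetical, and keeping the first index unchanged is an assumption about the convention rather than a statement about the exact on-disk format.

```python
from itertools import accumulate


def delta_encode(indices: list[int]) -> list[int]:
    """Keep the first index; store each later index as the gap from its predecessor."""
    if not indices:
        return []
    return [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """A running sum restores the original indices."""
    return list(accumulate(deltas))


# One sparse row: the non-zero values and the columns they occupy.
values = [3.0, 1.0, 7.0, 2.0]
columns = [2, 5, 6, 1000]

deltas = delta_encode(columns)
print(deltas)  # [2, 3, 1, 994]
restored = delta_decode(deltas)
```

The values themselves are stored unchanged; only the indices are transformed.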

#### SingleCellExperiment

A `SingleCellExperiment` is converted recursively, one supported component at a time:

- `assays` -> Matrices
- `row_data` / `col_data` -> BiocFrames
- `reduced_dims` -> BiocFrames (column-wise)
- `alternative_experiments` -> Nested SingleCellExperiments

### 3. Client Interface

The `wobbegong.load()` function acts as a factory: it reads the directory's `summary.json` and returns the appropriate reader object.
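The factory idea can be sketched in a few lines. This is not the package's implementation: the `object` key, its values, the reader classes, and `load_sketch` are all hypothetical names chosen to illustrate dispatching on a summary file.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MatrixReader:  # hypothetical stand-in for a matrix reader
    path: Path
    summary: dict


@dataclass
class FrameReader:  # hypothetical stand-in for a data frame reader
    path: Path
    summary: dict


# Assumed mapping from a type tag in summary.json to a reader class.
READERS = {"matrix": MatrixReader, "data_frame": FrameReader}


def load_sketch(path):
    """Pick a reader class from the (assumed) "object" key in summary.json."""
    path = Path(path)
    summary = json.loads((path / "summary.json").read_text())
    kind = summary["object"]
    if kind not in READERS:
        raise ValueError(f"unsupported object type: {kind!r}")
    return READERS[kind](path, summary)


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "summary.json").write_text(json.dumps({"object": "matrix"}))
    reader = load_sketch(tmp)
    print(type(reader).__name__)  # MatrixReader
```

Callers only ever deal with one entry point, and supporting a new object type means adding one entry to the mapping.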

**Accessing Matrices:**

```python
import wobbegong

mat = wobbegong.load("data/matrix")

# Get expression for the 5th gene (rows are zero-indexed, so index 4)
row_vec = mat.get_row(4)

# Get pre-calculated statistics (read from `stats`, not recomputed)
total_counts = mat.get_statistic("row_sum")
```

**Accessing DataFrames:**

```python
import wobbegong

df = wobbegong.load("data/metadata")

# Get a specific column by name
cell_ids = df.get_column("cell_id")
```

Check out the [R package](https://github.com/kanaverse/wobbegong-R) for more details.