The Khedra Book

Khedra (pronounced kɛd-ɾɑ) is an all-in-one "long-running" tool for indexing and sharing the Unchained Index and monitoring individual addresses on EVM-compatible blockchains.

The tool creates and shares the Unchained Index which is a permissionless index of "address appearances," including appearances in event logs, execution traces, incoming transactions, modifications to smart contract state, staking or block rewards, prefund allocations and many other locations.

This detailed indexing allows for near-perfect monitoring and notifications of address activity, which leads to many benefits. The benefits include native and ERC-20 account balance histories, address auditing and accounting, and even custom indexing. It works for any address on any chain (as long as you have access to the chain's RPC).

Enjoy!

Please help us improve this software by providing any feedback or suggestions. Contact information and links to our socials are available at our website.

About the Name

The name khedra (pronounced kɛd-ɾɑ) is inspired by the Persian word خدمت (khedmat), meaning "service."

In ancient Persian culture, service was considered a noble pursuit, emphasizing dedication, reliability, and humility in action. Drawing from this tradition, the name khedra embodies the essence of a system designed to serve--efficiently, continuously, and with purpose.

Similar to its counterpart, chifra (derived from the Persian word for "cipher"), the name khedra symbolizes a long-running, dependable process that tirelessly "serves" the needs of its users.

More technically, khedra is a collection of goroutines that:

  • creates and publishes the Unchained Index,
  • monitors a user-provided list of addresses, automating caching, notifications, and other ETL processes,
  • provides a RESTful API exposing chifra's many data access commands,
  • allows for starting, stopping, pausing, and resuming these individual services.

By choosing the name khedra, we honor a legacy of service while committing to building tools that are as resilient, adaptive, and reliable as the meaning behind its name.

User Manual

Overview of Khedra

Khedra is a blockchain indexing and monitoring application designed to provide users with an efficient way to interact with and manage transactional histories for EVM-compatible blockchains. It supports functionalities such as transaction monitoring, address indexing, publishing and pinning the indexes to IPFS and a smart contract, and a RESTful API for accessing data.

Purpose of this Document

This "User's Manual" is designed to help users get started with Khedra, understand its features, and operate the application effectively for both basic and advanced use cases. For a more technical treatment of the software, refer to the Technical Specification.

Intended Audience

This manual is intended for:

  • End-users looking to index and monitor blockchain data.
  • Developers integrating blockchain data into their applications.
  • System administrators managing blockchain-related infrastructure.

Introduction

What is Khedra?

Khedra (pronounced kɛd-ɾɑ) is an all-in-one blockchain indexing and monitoring solution for EVM-compatible blockchains. It provides a comprehensive suite of tools to index, monitor, serve, and share blockchain data in a local-first, privacy-preserving manner.

At its core, Khedra creates and maintains the Unchained Index - a permissionless index of address appearances across blockchain data, including transactions, event logs, execution traces, and more. This detailed indexing enables powerful monitoring capabilities for any address on any supported chain.

Key Features

1. Comprehensive Indexing

Khedra indexes address appearances from multiple sources:

  • Transactions (senders and recipients)
  • Event logs (topics and data fields)
  • Execution traces (internal calls)
  • Smart contract state changes
  • Block rewards and staking activities
  • Genesis allocations

The resulting index allows for lightning-fast lookups of any address's complete on-chain history.

2. Multi-Chain Support

While Ethereum mainnet is required, Khedra works with any EVM-compatible blockchain, including:

  • Test networks (Sepolia, etc.)
  • Layer 2 solutions (Optimism, Arbitrum)
  • Alternative EVMs (Gnosis Chain, etc.)

Each chain requires only a valid RPC endpoint to begin indexing.

3. Modular Service Architecture

Khedra operates through five interconnected services:

  • Control Service: Central management API
  • Scraper Service: Builds and maintains the Unchained Index
  • Monitor Service: Tracks specific addresses of interest
  • API Service: Provides data access via REST endpoints
  • IPFS Service: Enables distributed sharing of index data

These services can be enabled or disabled independently to suit your needs.

4. Privacy-Preserving Design

Unlike traditional blockchain explorers that track user behavior, Khedra:

  • Runs entirely on your local machine
  • Never sends queries to third-party servers
  • Doesn't track or log your address lookups
  • Gives you complete control over your data

5. Distributed Index Sharing

The Unchained Index can be optionally shared and downloaded via IPFS, creating a collaborative network where:

  • Users can contribute to building different parts of the index
  • New users can download existing index portions instead of rebuilding
  • The index becomes more resilient through distribution

Use Cases

Khedra excels in numerous blockchain data scenarios:

  • Account History: Track complete transaction and interaction history for any address
  • Balance Tracking: Monitor native and ERC-20 token balances over time
  • Smart Contract Monitoring: Watch for specific events or interactions with contracts
  • Auditing and Accounting: Export complete financial histories for tax or business purposes
  • Custom Indexing: Build specialized indices for specific protocols or applications
  • Data Analysis: Extract patterns and insights from comprehensive on-chain data

Getting Started

The following chapters will guide you through:

  1. Installing and configuring Khedra
  2. Understanding the core concepts and architecture
  3. Using the various components and services
  4. Advanced operations and customization
  5. Maintenance and troubleshooting

Whether you're a developer, researcher, trader, or blockchain enthusiast, Khedra provides the tools you need to extract maximum value from blockchain data while maintaining your privacy and autonomy.

Implementation Details

The core features of Khedra described in this introduction are implemented in the following Go files:

Getting Started

Overview

Khedra runs primarily from a configuration file called config.yaml. This file lives at ~/.khedra/config.yaml by default. If the file is not found, Khedra creates a default configuration in this location.

The config file allows you to specify key parameters for running khedra, including which chains to index/monitor, which services to enable, how detailed to log the processes, and where and how to publish (that is, share) the results.

You may use environment variables to override specific options. This document outlines the configuration file structure, validation rules, default values, and environment variable usage.


Quick Start

  1. Download, build, and test khedra:

    git clone https://github.com/TrueBlocks/trueblocks-khedra.git
    cd trueblocks-khedra
    go build -o khedra main.go
    ./khedra version
    

    You should get something similar to khedra v4.0.0-release.

  2. You may edit the config file with:

    ./khedra config edit
    

    Modify the file according to your requirements (see below).

    The minimal configuration needed is to provide a valid RPC to Ethereum mainnet. (All configurations require access to Ethereum mainnet.)

    You may configure as many other EVM-compatible chains (each with its own RPC) as you like.

  3. Use the Wizard:

    You may also use the khedra wizard to create a configuration file. The wizard will prompt you for the required information and generate a config.yaml file.

    ./khedra init
    
  4. Location of the configuration file:

    By default, the config file resides at ~/.khedra/config.yaml. (The folder and the file will be created if they do not exist.)

    You may, however, place a config.yaml file in the current working folder (the folder from which you run khedra). If found, this local configuration file takes precedence. This allows you to run multiple instances of the software concurrently.

    If no config.yaml file is found, khedra creates a default configuration in its default location.

  5. Using Environment Variables:

    You may override configuration options using environment variables, each of which must take the form TB_KHEDRA_<section>_<key>.

    For example, the following overrides the general.dataFolder value.

    export TB_KHEDRA_GENERAL_DATAFOLDER="/path/override"
    

    Note that underscores (_) in the <key> names are not needed and should be omitted.


Configuration File Format

The config.yaml file (shown here with default values) is structured as follows:

# Khedra Configuration File
# Version: 2.0

general:
  dataFolder: "~/.khedra/data"  # See note 1
  strategy: "download"          # How to build the Unchained Index [download* | scrape]
  detail: "index"               # How detailed to log the processes [index* | blooms]

chains:
  mainnet:                       # Blockchain name (see notes 2 and 3)
    rpcs:                        # A list of RPC endpoints (at least one is required)
      - "rpc_endpoint_for_mainnet"
    enabled: true                # `true` if this chain is enabled
  sepolia:
    rpcs:
      - "rpc_endpoint_for_sepolia"
    enabled: true
  gnosis:                         # Add as many chains as your machine can handle
    rpcs:
      - "rpc_endpoint_for_gnosis" # must be a reachable URL if the chain is enabled
    enabled: false                # in this example, this chain is disabled
  optimism:
    rpcs:
      - "rpc_endpoint_for_optimism"
    enabled: false

services:                          # See note 4
  scraper:                         # Required. (One of: api, scraper, monitor, ipfs, control)
    enabled: true                  # `true` if the service is enabled
    sleep: 12                      # Seconds between scraping batches (see note 5)
    batchSize: 500                 # Number of blocks to process in a batch (range: 50-10000)

  monitor:
    enabled: true
    sleep: 12                      # Seconds between monitoring batches (see note 5)
    batchSize: 500                 # Number of blocks processed in a batch (range: 50-10000)

  api:
    enabled: true
    port: 8080                     # Port number for API service (the port must be available)

  ipfs:
    enabled: true
    port: 5001                     # Port number for IPFS service (the port must be available)

  control:
    enabled: true                  # Always enabled - false values are invalid
    port: 8338                     # Port number for control service (the port must be available)

logging:
  folder: "~/.khedra/logs"         # Path to log directory (must exist and be writable)
  filename: "khedra.log"           # Log file name (must end with .log)
  toFile: false                    # If true, will write to above file. Screen only otherwise
  level: "info"                    # One of: debug, info, warn, error
  maxSize: 10                      # Max log file size in MB
  maxBackups: 5                    # Number of backup log files to keep
  maxAge: 30                       # Number of days to retain old logs
  compress: true                   # Whether to compress backup logs

Notes:

  1. The dataFolder value must be a valid, existing directory that is writable. You may wish to change this value to a location with suitable disk space. Depending on configuration, the Unchained Index and binary caches may get large (> 200GB in some cases).

  2. The chains section is required. At least one chain must be enabled. An RPC for mainnet is required even if mainnet is disabled. The software reads mainnet smart contracts (such as the Unchained Index and UniSwap) during normal operation.

  3. This repository is used to identify chain names. Using consistent chain names aids in sharing indexes. Use these values in your configuration if you wish to fully participate in sharing the Unchained Index.

  4. The services section is required. At least one service must be enabled.

  5. When a scraper or monitor is "catching up" to a chain, the sleep value is ignored.


Using Environment Variables

Khedra allows configuration values to be overridden at runtime using environment variables. The value of an environment variable takes precedence over the defaults and the configuration file.

Naming Environment Variables

The environment variable naming convention is:

TB_KHEDRA_<section>_<key>

For example:

  • To override the general.dataFolder value:

    export TB_KHEDRA_GENERAL_DATAFOLDER="/path/override"
    
  • To override logging.level:

    export TB_KHEDRA_LOGGING_LEVEL="debug"
    

Underscores (_) in <key> names are not used and should be omitted.
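The naming convention above can be sketched in Go. Here, envKey is a hypothetical helper (not part of khedra's actual code) that builds the TB_KHEDRA_<SECTION>_<KEY> name, uppercasing each part and dropping underscores from key names:

```go
package main

import (
	"fmt"
	"strings"
)

// envKey builds the TB_KHEDRA_<SECTION>_<KEY> environment variable name
// for a configuration value. Underscores within each part are dropped,
// per the naming convention. Illustrative helper only.
func envKey(parts ...string) string {
	upper := make([]string, 0, len(parts)+1)
	upper = append(upper, "TB_KHEDRA")
	for _, p := range parts {
		upper = append(upper, strings.ToUpper(strings.ReplaceAll(p, "_", "")))
	}
	return strings.Join(upper, "_")
}

func main() {
	fmt.Println(envKey("general", "dataFolder")) // TB_KHEDRA_GENERAL_DATAFOLDER
	fmt.Println(envKey("logging", "level"))      // TB_KHEDRA_LOGGING_LEVEL
}
```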

Overriding Chains and Services

Environment variables can also be used to override values for chains and services settings. The naming convention for these sections is as follows:

TB_KHEDRA_<section>_<name>_<key>

Where:

  • <section> is either CHAINS or SERVICES.
  • <name> is the name of the chain or service (converted to uppercase).
  • <key> is the specific field to override.

Examples

To override the RPC endpoints for the mainnet chain:

export TB_KHEDRA_CHAINS_MAINNET_RPCS="http://rpc1.mainnet,http://rpc2.mainnet"

You may list multiple RPC endpoints by separating them with commas.
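On the receiving side, such a comma-separated value must be split back into a list of endpoints. A minimal sketch in Go (splitRPCs is a hypothetical helper, not khedra's actual parsing code):

```go
package main

import (
	"fmt"
	"strings"
)

// splitRPCs turns a comma-separated TB_KHEDRA_CHAINS_<NAME>_RPCS value
// into a slice of endpoints, trimming stray whitespace around each one.
func splitRPCs(value string) []string {
	if value == "" {
		return nil // empty value: treated as "not set" (see Behavior for Empty RPCs)
	}
	parts := strings.Split(value, ",")
	for i, p := range parts {
		parts[i] = strings.TrimSpace(p)
	}
	return parts
}

func main() {
	rpcs := splitRPCs("http://rpc1.mainnet,http://rpc2.mainnet")
	fmt.Println(len(rpcs), rpcs[0]) // 2 http://rpc1.mainnet
}
```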

To disable the mainnet chain:

export TB_KHEDRA_CHAINS_MAINNET_ENABLED="false"

To enable the api service:

export TB_KHEDRA_SERVICES_API_ENABLED="true"

To set the port for the api service:

export TB_KHEDRA_SERVICES_API_PORT="8088"

Precedence Rules

  1. Default values are loaded first,
  2. Values from config.yaml override the defaults,
  3. Environment variables take precedence over both the defaults and the file.

The values set by environment variables must conform to the same validation rules as the configuration file.
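The three-layer precedence can be modeled as successive overlays where later layers win. A toy sketch using flat maps (khedra's real configuration is structured YAML, so this is illustrative only):

```go
package main

import "fmt"

// mergeLayers applies the precedence rules above: defaults are loaded
// first, then the config file, then environment variables. A key set
// in a later layer overrides the same key from an earlier layer.
func mergeLayers(layers ...map[string]string) map[string]string {
	merged := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	defaults := map[string]string{"logging.level": "info", "general.dataFolder": "~/.khedra/data"}
	file := map[string]string{"logging.level": "warn"}
	env := map[string]string{"general.dataFolder": "/path/override"}
	cfg := mergeLayers(defaults, file, env)
	fmt.Println(cfg["logging.level"], cfg["general.dataFolder"]) // warn /path/override
}
```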


Configuration Sections

General Settings

  • dataFolder: The location where khedra stores all of its data. This directory must exist and be writable.
  • strategy: The strategy used to initialize the Unchained Index. With download (the default), the Unchained Index smart contract is consulted and the index is downloaded from IPFS. With scrape, the entire index is created from scratch. The former takes much less time but relies on values created by a third party. The latter (scrape) uses only the RPC as a source, which takes significantly longer but is the most secure, as no third-party trust is required.
  • detail: The detail level of the downloaded or scraped index. With index, both the Bloom filters and the index chunks are downloaded or built (depending on strategy). With blooms, only the Bloom filters are retained; index chunks are downloaded on an as-needed basis through chifra export. The former is much larger and takes much longer to download (if strategy is scrape, no time savings is seen). The latter is much smaller and faster to download. Downloading or creating the full index is the default.

Chains (Blockchains)

Defines the blockchain networks to interact with. Each chain must have:

  • name: Chain name (e.g., mainnet).
  • rpcs: List of RPC endpoints. At least one valid and reachable endpoint is required. A mainnet RPC is required, but you are not required to index mainnet.
  • enabled: Whether the chain is being actively indexed.

Behavior for Empty RPCs

  • If the RPCs field is empty in the environment, it is ignored and the configuration file's value is preserved.
  • If the RPCs field is empty in the final configuration (after merging), the chain is treated as if it were disabled.

Services (API, Scraper, Monitor, IPFS)

Defines various services provided by Khedra. Supported services:

  • API:
    • An API server for the chifra command line interface. See API Documentation for details.
    • Requires port to be specified in the configuration.
  • Scraper and Monitor:
    • These two services scrape and monitor blockchain data, respectively. Each runs periodically to keep the index or monitor data up to date.
    • sleep: Duration (seconds) between operations.
    • batchSize: Number of blocks to process in each operation (50-10,000).
  • IPFS:
    • A service for interacting with IPFS (InterPlanetary File System). This service starts an internal IPFS daemon if it's not already running. The scraper service may use IPFS to pin and share the index if so configured.
    • Requires port to be specified.
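The sleep/batchSize behavior described above can be sketched as a batch scheduler. Here, nextBatch is a hypothetical helper (not khedra's actual code) showing how a block range might be chosen, and when the service would sleep; recall that while "catching up," the sleep between batches is skipped:

```go
package main

import "fmt"

// nextBatch returns the block range a scraper-like service would process
// next, given the last indexed block, the chain head, and batchSize.
// When caughtUp is true there is nothing to do: the service sleeps for
// its configured interval and re-checks the head.
func nextBatch(lastIndexed, head, batchSize uint64) (first, last uint64, caughtUp bool) {
	first = lastIndexed + 1
	if first > head {
		return 0, 0, true
	}
	last = first + batchSize - 1
	if last > head {
		last = head // partial batch at the tip of the chain
	}
	return first, last, false
}

func main() {
	first, last, done := nextBatch(1_000_000, 1_000_700, 500)
	fmt.Println(first, last, done) // 1000001 1000500 false
}
```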

Logging Configuration

Controls the application's logging behavior:

  • folder: Directory for storing logs.
  • filename: Name of the log file.
  • toFile: If true, logs are written to the specified file. If false, logs are only printed to the console.
  • level: Logging level. Possible values: debug, info, warn, error.
  • maxSize: Maximum log file size before rotation.
  • maxBackups: Number of old log files to retain.
  • maxAge: Retention period for old logs.
  • compress: Whether to compress rotated logs.

Validation Rules

The configuration file and environment variables are validated when the program starts with the following rules:

General

  • dataFolder: Must be a valid, existing directory and writable.
  • strategy: Must be either download or scrape.
  • detail: Must be either index or blooms.

Chains

  • name: Required and non-empty.
  • rpcs: Must include at least one valid and reachable RPC URL.
  • Empty RPC Behavior: Ignored from the environment, but required in the final configuration.
  • enabled: Defaults to false if not specified.

Notes on chains section

  1. The mainnet RPC is required even if indexing the chain is disabled. The software reads mainnet smart contracts (such as the Unchained Index and UniSwap) during normal operation.
  2. It is always best to have a dedicated RPC endpoint. If you are using a public RPC endpoint, be sure to check the rate limits and usage policies of the provider and set the sleep and batchSize values for the services appropriately. Some providers (perhaps all) will block or throttle requests that exceed certain limits.

Services

  • name: Required and non-empty. Must be one of api, scraper, monitor, ipfs, control.
  • enabled: Defaults to false if not specified.
  • port: For the API, IPFS, and control services, must be between 1024 and 65535. Ignored for other services.
  • sleep: Must be non-negative. Ignored by API and IPFS services.
  • batchSize: Must be between 50 and 10,000. Ignored by API and IPFS services.
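The port and batchSize rules above can be expressed as a small validator. A sketch only (validateService is a hypothetical helper, not khedra's actual validation code):

```go
package main

import "fmt"

// validateService checks the per-service rules listed above: api and
// ipfs need a port in 1024-65535; scraper and monitor need a batchSize
// in 50-10000. Other fields are ignored for the wrong service kind.
func validateService(name string, port, batchSize int) error {
	switch name {
	case "api", "ipfs":
		if port < 1024 || port > 65535 {
			return fmt.Errorf("%s: port %d out of range 1024-65535", name, port)
		}
	case "scraper", "monitor":
		if batchSize < 50 || batchSize > 10000 {
			return fmt.Errorf("%s: batchSize %d out of range 50-10000", name, batchSize)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateService("api", 8080, 0))   // <nil>
	fmt.Println(validateService("scraper", 0, 20)) // scraper: batchSize 20 out of range 50-10000
}
```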

Logging

  • folder: Must exist and be writable.
  • filename: Must end with .log.
  • toFile: Must be true or false.
  • level: Must be one of debug, info, warn, error.
  • maxSize: Minimum value of 5.
  • maxBackups: Minimum value of 1.
  • maxAge: Minimum value of 1.

Default Values

If the configuration file is not found or incomplete, Khedra uses the following defaults:

  • Data directory: ~/.khedra/data
  • Logging configuration:
    • Folder: ~/.khedra/logs
    • Filename: khedra.log
    • Max size: 10 MB
    • Max backups: 3
    • Max age: 10 days
    • Compression: Enabled
    • Log level: info
  • Chains: Only mainnet and gnosis enabled by default.
  • Services: All services (api, scraper, monitor, ipfs) enabled with default configurations.

Common Commands

  1. Validate Configuration: Khedra validates the config.yaml file and environment variables automatically on startup.

  2. Run Khedra:

    ./khedra daemon
    

    Ensure that your config.yaml file is properly set up.

  3. Override Configuration with Environment Variables:

    Use environment variables to override specific configurations:

    export TB_KHEDRA_GENERAL_DATAFOLDER="/new/path"
    ./khedra
    

For additional details, see the technical specification.

Implementation Details

The configuration system and initialization described in this section are implemented in these Go files:

  • Configuration Loading: app/config.go - Contains the LoadConfig() function that loads, merges, and validates configuration from files and environment variables

  • Configuration Validation:

  • Environment Variables: pkg/types/apply_env.go - Contains functions for applying environment variables to the configuration

  • Initialization Command: app/action_init.go - Implements the init command to set up the initial configuration

  • Folder and Path Management: Found in the initializeFolders() function in app/config.go which ensures required directories exist

Understanding Khedra

Core Concepts

The Unchained Index

The foundation of Khedra is the Unchained Index - a specialized data structure that maps blockchain addresses to their appearances in blockchain data. Think of it as a reverse index: while a blockchain explorer lets you look up a transaction and see which addresses were involved, the Unchained Index lets you look up an address and see all transactions where it appears.

The index captures appearances from multiple sources:

  • External Transactions: Direct sends and receives
  • Internal Transactions: Contract-to-contract calls (from traces)
  • Event Logs: Events emitted by smart contracts
  • State Changes: Modifications to contract storage
  • Special Appearances: Block rewards, validators, etc.

What makes this particularly powerful is that the index includes trace-derived appearances - meaning it captures internal contract interactions that normal blockchain explorers miss.

Address Appearances

An "appearance" in Khedra means any instance where an address is referenced in blockchain data. Each appearance record contains:

  • The address that appeared
  • The block number where it appeared
  • The transaction index within that block
  • Additional metadata about the appearance type

These compact records allow Khedra to quickly answer the fundamental question: "Where does this address appear in the blockchain?"
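A record with the fields listed above might look like the following Go struct. The field names are illustrative assumptions; khedra's actual appearance types live in the TrueBlocks-core library:

```go
package main

import "fmt"

// Appearance is a sketch of the compact record described above: which
// address appeared, where, and why. Field names are hypothetical.
type Appearance struct {
	Address          string // the address that appeared
	BlockNumber      uint64 // block in which it appeared
	TransactionIndex uint32 // position of the transaction within the block
	Reason           string // appearance type: "log", "trace", "reward", ...
}

func main() {
	app := Appearance{
		Address:          "0x1234567890abcdef1234567890abcdef12345678",
		BlockNumber:      19_000_000,
		TransactionIndex: 42,
		Reason:           "log",
	}
	fmt.Printf("%s appears in block %d, tx %d (%s)\n",
		app.Address, app.BlockNumber, app.TransactionIndex, app.Reason)
}
```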

Local-First Architecture

Khedra operates as a "local-first" application, meaning:

  1. All data processing happens on your local machine
  2. Your queries never leave your computer
  3. You maintain complete ownership of your data
  4. The application continues to work without internet access

This approach maximizes privacy and resilience while minimizing dependency on external services.

Distributed Collaboration

While Khedra is local-first, it also embraces distributed collaboration through IPFS integration:

  • The Unchained Index can be shared and downloaded in chunks
  • Users can contribute to different parts of the index
  • New users can bootstrap quickly by downloading existing index portions
  • The system becomes more resilient as more people participate

This creates a hybrid model that preserves privacy while enabling community benefits.

System Architecture

Service Components

Khedra is organized into five core services:

  1. Control Service

    • Central management interface
    • Exposes API endpoints for service control
    • Handles configuration and coordinating other services
  2. Scraper Service

    • Processes blockchain data to build the Unchained Index
    • Extracts address appearances from blocks, transactions, and traces
    • Works in configurable batches with adjustable sleep intervals
  3. Monitor Service

    • Tracks specific addresses of interest
    • Provides notifications for address activities
    • Maintains focused indices for monitored addresses
  4. API Service

    • Exposes data through REST endpoints (defined here: API Docs)
    • Provides query interfaces for the index and monitors
    • Enables integration with other tools and services
  5. IPFS Service

    • Facilitates distributed sharing of index data
    • Handles publishing and retrieving chunks via IPFS
    • Enables collaborative index building

Data Flow

Here's how data flows through the Khedra system:

  1. The Scraper retrieves blockchain data from configured RPC endpoints
  2. Address appearances are extracted and added to the Unchained Index
  3. The Monitor service checks new blocks for appearances of watched addresses
  4. The API service provides query access to the indexed data
  5. Optionally, index chunks are shared via the IPFS service

Directory Structure

Khedra organizes its data with this structure:

~/.khedra/
├── config.yaml       # Main configuration file
├── data/             # Main data directory
│   ├── mainnet/      # Chain-specific data
│   │   ├── cache/    # Binary caches
│   │   ├── monitors/ # Address monitor data
│   │   └── index/    # Unchained Index chunks
│   └── [other-chains]/
└── logs/             # Application logs

The above structure may vary depending on your version and configuration. Each chain has its own subdirectory, allowing Khedra to manage multiple chains simultaneously.
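Configured paths such as ~/.khedra/data use a leading tilde, which must be expanded to the user's home directory before use. A minimal sketch of that path handling (expandHome is a hypothetical helper, not khedra's actual implementation):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandHome resolves a leading "~" in configured paths such as
// ~/.khedra/data to the current user's home directory.
func expandHome(path string) (string, error) {
	if path == "~" || strings.HasPrefix(path, "~/") {
		home, err := os.UserHomeDir()
		if err != nil {
			return "", err
		}
		return filepath.Join(home, strings.TrimPrefix(path, "~")), nil
	}
	return path, nil // absolute and relative paths pass through unchanged
}

func main() {
	p, _ := expandHome("~/.khedra/data")
	fmt.Println(p)
}
```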

Terminology

To help navigate Khedra effectively, here are key terms you'll encounter:

  • Appearance: Any reference to an address in blockchain data
  • Chunk: A portion of the Unchained Index covering a range of blocks
  • Finalized: Blocks that have reached consensus and won't be reorganized
  • Monitor: A configuration to track specific addresses of interest
  • RPC: Remote Procedure Call - the method for communicating with blockchain nodes
  • Trace: Detailed execution record of a transaction, including internal calls

Understanding these core concepts provides the foundation for effectively using Khedra's capabilities, which we'll explore in the following chapters.

Implementation Details

The core concepts and system architecture described in this chapter are implemented in the following Go files:

The Unchained Index

The Unchained Index implementation is handled primarily by the TrueBlocks-core library, with Khedra providing the service framework. The primary code files for index interactions are:

  • Index Management: The scraper service, implemented in the service framework initialized in app/action_daemon.go

Address Monitoring

The monitoring system for tracking address appearances is implemented in:

Service Components

The five core services are defined and initialized in these files:

  • Service Definitions: pkg/types/service.go defines the Service struct and validation rules
  • Service Initialization: app/action_daemon.go in the daemonAction function initializes each service based on configuration
  • Service Manager: The ServiceManager is created in the daemonAction function to coordinate all services

Directory Structure

The directory structure described in this chapter is established by:

  • Folder Initialization: The initializeFolders function in app/config.go
  • Path Resolution: Path management and expansion functions throughout the codebase handle the directory structure

Using Khedra

This chapter covers the practical aspects of working with Khedra once it's installed and configured.

Understanding Khedra's Command Structure

Khedra provides a streamlined set of commands designed to index, monitor, serve, and share blockchain data:

NAME:
   khedra - A tool to index, monitor, serve, and share blockchain data

USAGE:
   khedra [global options] command [command options]

VERSION:
   v5.1.0

COMMANDS:
   init     Initializes Khedra
   daemon   Runs Khedra's services
   config   Manages Khedra configuration
   help, h  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --help, -h     show help
   --version, -v  print the version

Getting Started with Khedra

Initializing Khedra

Before using Khedra, you need to initialize it. This sets up the necessary data structures and configurations:

khedra init

During initialization, Khedra will:

  • Set up its directory structure
  • Configure initial settings
  • Prepare the system for indexing blockchain data

Managing Configuration

To view or modify Khedra's configuration:

khedra config [show | edit]

The configuration command allows you to:

  • View current settings
  • Update connection parameters
  • Adjust service behaviors
  • Configure chain connections

Running Khedra's Services

To start Khedra's daemon services:

khedra daemon

This command:

  • Starts the indexing service
  • Enables the API server if configured
  • Processes monitored addresses
  • Handles data serving capabilities

You can use various options with the daemon command to customize its behavior. For detailed options:

khedra daemon --help

Common Workflows

Basic Setup

  1. Install Khedra using the installation instructions

  2. Initialize the system:

    khedra init
    
  3. Configure as needed:

    khedra config edit
    
  4. Start the daemon services:

    khedra daemon
    

Checking System Status

You can view the current status of Khedra by examining the daemon process:

curl http://localhost:8338/status | jq

  • Note: The port for the above command defaults to one of 8338, 8337, 8336, or 8335, in that order, whichever is first available. If none of those ports is available, the daemon will not start.
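The port fallback described in the note can be sketched as follows. pickControlPort is a hypothetical helper mimicking the behavior, not khedra's actual startup code:

```go
package main

import (
	"fmt"
	"net"
)

// pickControlPort tries 8338, 8337, 8336, and 8335 in order and returns
// the first port that can be bound. If none is available, the daemon
// would refuse to start.
func pickControlPort() (int, error) {
	for _, port := range []int{8338, 8337, 8336, 8335} {
		ln, err := net.Listen("tcp", fmt.Sprintf("localhost:%d", port))
		if err == nil {
			ln.Close() // the port is free; release it for the real listener
			return port, nil
		}
	}
	return 0, fmt.Errorf("no control port available")
}

func main() {
	port, err := pickControlPort()
	fmt.Println(port, err)
}
```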

Accessing the Data API

When the daemon is running and the API service is enabled, it provides API endpoints for accessing blockchain data. The default configuration typically serves on:

curl http://localhost:8080/status

See the API documentation for more details on available endpoints and their usage.

Getting Help

Each command provides detailed help information. To access help for any command:

khedra [command] --help

For general help:

khedra --help

Version Information

To check which version of Khedra you're running:

khedra --version

Advanced Usage

For more detailed information about advanced operations and configurations, please refer to the documentation for each specific command:

khedra init --help
khedra daemon --help
khedra config --help

The next chapter covers advanced operations for users who want to maximize Khedra's capabilities.

Implementation Details

The command structure and functionality described in this section are implemented in these Go files:

Core Command Structure

  • CLI Framework: app/cli.go - Defines the top-level command structure using the urfave/cli package

Command Implementations

Helper Functions

Maintenance and Troubleshooting

This chapter covers routine maintenance tasks and solutions to common issues you might encounter when using Khedra.

Routine Maintenance

Regular Updates

To keep Khedra running smoothly, periodically check for and install updates:

# Check current version
khedra version

# Update to the latest version
go get -u github.com/TrueBlocks/trueblocks-khedra/v5

# Rebuild and install
cd <path_for_khedra_github_repo>
git pull --recurse-submodules
go build -o bin/khedra main.go
./bin/khedra version

Log Rotation

Khedra automatically rotates logs based on your configuration, but you should periodically check log usage:

# Check log directory size
du -sh ~/.khedra/logs

# List log files
ls -la ~/.khedra/logs

If logs are consuming too much space, adjust your logging configuration:

logging:
  maxSize: 10      # Maximum size in MB before rotation
  maxBackups: 5    # Number of rotated files to keep
  maxAge: 30       # Days to keep rotated logs
  compress: true   # Compress rotated logs

Index Verification

Periodically verify the integrity of your Unchained Index:

chifra chunks index --check --chain <chain_name>

This checks for any gaps or inconsistencies in the index and reports issues.

Cache Management

You may check on the cache size and prune old caches (by hand) to free up space:

# Check cache size
chifra status --verbose

Troubleshooting

Common Issues and Solutions

Service Won't Start

Symptoms: A service fails to start or immediately stops.

Solutions:

  1. Check the logs for error messages:

    tail -n 100 ~/.khedra/logs/khedra.log
    
  2. Verify the service's port isn't in use by another application:

    lsof -i :<port_number>
    
  3. Ensure the RPC endpoints are accessible:

    chifra status
    
  4. Try starting with verbose logging:

    TB_KHEDRA_LOGGING_LEVEL=debug TB_KHEDRA_LOGGING_TOFILE=true khedra start
    

Slow Indexing

Symptoms: Indexing is progressing much slower than expected.

Solutions:

  1. Check RPC endpoint performance:

    chifra status --diagnose
    
  2. Adjust the batch size (higher or lower) in the configuration:

    services:
      scraper:
        batchSize: 1000  # Increase from default
    
  3. Monitor system resources to identify bottlenecks:

    top -c -p "$(pgrep -d, khedra)"
    
  4. Consider using a faster RPC endpoint or running your own node.

Index Gaps

Symptoms: The index status shows gaps in block coverage.

Solutions:

  1. Identify the missing ranges:

    chifra chunks index --check --chain <chain_name>
    
  2. In very rare cases, you may truncate the index; Khedra will rebuild it from that point forward. BE CAREFUL -- THIS IS NOT A RECOMMENDED SOLUTION.

    chifra chunks index --truncate <block_number> --chain <chain_name>
    

API Connection Issues

Symptoms: Unable to connect to Khedra's API.

Solutions:

  1. Verify the API service is running:

    curl http://localhost:8080/status
    
  2. Check if the configured port is accessible:

    lsof -i :8080
    
  3. Look for firewall or permission issues:

    sudo lsof -i :8080
    

IPFS Connectivity Problems

Symptoms: Unable to publish or fetch via IPFS.

Solutions:

  1. Check IPFS service status:

    ps -ef | grep ipfs
    
  2. Restart Khedra.

Log Analysis

Khedra's logs are your best resource for troubleshooting. Here's how to use them effectively:

# View recent log entries
tail -f ~/.khedra/logs/khedra.log

# Search for error messages
grep -i error ~/.khedra/logs/khedra.log

# Find logs related to a specific service
grep "scraper" ~/.khedra/logs/khedra.log

# Find logs related to a specific address
grep "0x742d35Cc6634C0532925a3b844Bc454e4438f44e" ~/.khedra/logs/khedra.log

Getting Help

If you encounter issues you can't resolve:

  1. Check the Khedra GitHub repository for known issues
  2. Search the discussions forum for similar problems
  3. Submit a detailed issue report including:
    • Khedra version (khedra version)
    • Relevant log extracts
    • Steps to reproduce the problem
    • Your configuration (with sensitive data redacted)

Regular maintenance and prompt troubleshooting will keep your Khedra installation running smoothly and efficiently.

Implementation Details

The maintenance and troubleshooting procedures described in this document are implemented in several key files:

Service Management

  • Service Lifecycle Management: app/action_daemon.go - Contains the core service management code that starts, stops, and monitors services
  • Service Health Checks: Service status monitoring is implemented in the daemon action function

RPC Connection Management

Logging System

Error Recovery

The troubleshooting techniques described are supported by robust error handling throughout the codebase, especially in:

  • Service error handling: Found in the daemon action function
  • Validation error reporting: Implemented in the validation framework
  • Index management functions: For identifying and fixing gaps in the index

Wizard Screen Documentation

Introduction

Khedra's configuration wizard provides a streamlined, interactive way to set up your installation. Rather than manually editing the config.yaml file, the wizard walks you through each configuration section with clear explanations and validation.

User Interface Features

The wizard provides several helpful features:

  • Keyboard Navigation: Use arrow keys and shortcuts to navigate
  • Contextual Help: Press 'h' on any screen for detailed documentation
  • Editor Integration: Press 'e' to directly edit configuration files
  • Validation: Input is checked for correctness before proceeding
  • Visual Cues: Consistent layout with clear indicators for navigation options

Using the Wizard

Start the wizard with:

khedra init

Implementation Details

The configuration wizard described in this document is implemented through a package of Go files in the pkg/wizard directory:

Core Wizard Framework

Wizard Screen Implementations

The specific wizard screens visible in the user interface are implemented in these files:

Integration with Configuration System

The wizard integrates with the configuration system through:

  • Configuration Loading: In the ReloaderFn function passed to the wizard
  • Configuration Validation: Through the validation functions for each input field
  • Configuration Saving: In the final step of the wizard workflow

The wizard framework uses a screen-based approach with forward/backward navigation, real-time validation, and contextual help, exactly as described in this document.

Welcome Screen

Function

┌──────────────────────────────────────────────────────────────────────────────┐
│ ╔═══════════════════════════════════════════════════════╗                    │
│ ║                     KHEDRA WIZARD                     ║                    │
│ ║                                                       ║                    │
│ ║   Index, monitor, serve, and share blockchain data.   ║                    │
│ ╚═══════════════════════════════════════════════════════╝                    │
│                                                                              │
│ Welcome to Khedra, the world's only local-first indexer/monitor for          │
│ EVM blockchains. This wizard will help you configure Khedra. There are       │
│ three groups of settings: General, Services, and Chains.                     │
│                                                                              │
│ Type "q" or "quit" to quit, "b" or "back" to return to a previous screen,    │
│ "h" or "help" to get more information, or "e" to edit the file directly.     │
│                                                                              │
│ Press enter to continue.                                                     │
│                                                                              │
│ Keyboard: [h] Help [q] Quit [b] Back [enter] Continue                        │
└──────────────────────────────────────────────────────────────────────────────┘

Purpose

  • Introduces the wizard to the user
  • Orients the user to the configuration process
  • Provides clear navigation instructions

Navigation Options

  • Enter: Proceed to the next screen
  • h/help: Open browser with documentation
  • q/quit: Exit the wizard
  • b/back: Return to previous screen
  • e/edit: Edit configuration file directly

You may edit the configuration directly from any screen by typing e or edit. This opens the configuration file in your preferred text editor (defined by the EDITOR environment variable).

The welcome screen serves as the entry point to the configuration process, designed to be approachable while providing clear direction on how to proceed.

General Configuration Screen

┌──────────────────────────────────────────────────────────────────────────────┐
│ General Settings                                                             │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ The General group of options controls where Khedra stores the Unchained      │
│ Index and its caches. It also helps you choose a download strategy for       │
│ the index and helps you set up Khedra's logging options.                     │
│                                                                              │
│ Choose your folders carefully. The index and logs can get quite large        │
│ depending on the configuration. As always, type "help" to get more           │
│ information.                                                                 │
│                                                                              │
│ You may use $HOME or ~/ in your paths to refer to your home directory.       │
│                                                                              │
│ Press enter to continue.                                                     │
│                                                                              │
│ Keyboard: [h] Help [q] Quit [b] Back [enter] Continue                        │
└──────────────────────────────────────────────────────────────────────────────┘

Purpose

  • Allows users to configure high-level application settings
  • Sets up crucial file paths for data storage
  • Configures logging behavior

Key Features

  • Define the main data folder location with path expansion support
  • Configure index download and update strategies
  • Set up logging preferences for troubleshooting
  • Options for path expansion (supporting $HOME and ~/ notation)
  • Disk space requirement warnings
  • Input validation for directory existence and write permissions

Configuration Options

The General Settings screen presents these key configuration options:

  1. Data Folder: Where Khedra stores all index and cache data

    • Default: ~/.khedra/data
    • Must be a writable location with sufficient disk space
  2. Index Download Strategy:

    • IPFS-first: Prioritize downloading from the distributed network
    • Local-first: Prioritize building the index locally
    • Hybrid: Balance between downloading and local building
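Expressed in the configuration file, these choices might look like the sketch below. The section and key names are illustrative assumptions (only the logging keys shown earlier in this book are confirmed); check your generated config.yaml for the actual schema.

```yaml
# Illustrative sketch only; key names may differ in your installed version.
general:
  dataFolder: "~/.khedra/data"   # must be writable, with ample free space
  strategy: "ipfs-first"         # or "local-first", "hybrid"
```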

Services Configuration Screen

┌──────────────────────────────────────────────────────────────────────────────┐
│ Services Settings                                                            │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ Khedra provides five services. The first, "control," exposes endpoints to    │
│ control the other four: "scrape", "monitor", "api", and "ipfs".              │
│                                                                              │
│ You may disable/enable any combination of services, but at least one must    │
│ be enabled.                                                                  │
│                                                                              │
│ The next few screens will allow you to configure each service.               │
│                                                                              │
│                                                                              │
│                                                                              │
│ Press enter to continue.                                                     │
│                                                                              │
│ Keyboard: [h] Help [q] Quit [b] Back [enter] Continue                        │
└──────────────────────────────────────────────────────────────────────────────┘

Purpose

  • Enables users to select and configure Khedra's core services
  • Explains the relationship between the services
  • Ensures at least one service is enabled for proper functionality

Available Services

Khedra offers five core services that work together:

  1. Control Service: Management endpoints for the other services

    • Always enabled by default
    • Provides a central API for managing other services
  2. Scraper Service: Builds and maintains the Unchained Index

    • Processes blocks to extract address appearances
    • Configurable batch size and sleep interval
  3. Monitor Service: Tracks specific addresses of interest

    • Provides notifications for address activities
    • Configurable batch size and sleep interval
  4. API Service: REST API for data access

    • Configurable port number
    • Provides endpoints for querying the index and monitors
  5. IPFS Service: Distributed data sharing

    • Enables sharing and downloading index data
    • Configurable port number

Configuration Parameters

For each service, you can configure:

  • Enabled/Disabled: Toggle the service on or off
  • Port numbers: For services that expose network endpoints
  • Batch size: Number of blocks processed in one batch (for scraper/monitor)
  • Sleep interval: Time to wait between batches (for scraper/monitor)
  • Resource limits: Memory and CPU constraints
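Putting these parameters together, a sketch of the services section follows. The services/scraper/batchSize keys appear earlier in this chapter; the remaining keys and values are illustrative assumptions about the schema.

```yaml
services:
  scraper:
    enabled: true
    batchSize: 500     # blocks processed per batch
    sleep: 12          # seconds between batches (illustrative key name)
  monitor:
    enabled: false
  api:
    enabled: true
    port: 8080         # the API examples in this book assume 8080
  ipfs:
    enabled: true
    port: 5001         # illustrative value
```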

Chain Settings Screen

┌──────────────────────────────────────────────────────────────────────────────┐
│ Chain Settings                                                               │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ Khedra works with the Ethereum mainnet chain and any EVM-compatible          │
│ blockchain. Each chain requires at least one RPC endpoint URL and a          │
│ chain name.                                                                  │
│                                                                              │
│ Ethereum mainnet must be configured even if other chains are enabled.        │
│ The format of an RPC endpoint is protocol://host:port. For example:          │
│ http://localhost:8545 or https://mainnet.infura.io/v3/YOUR-PROJECT-ID.       │
│                                                                              │
│ The next few screens will help you configure your chains.                    │
│                                                                              │
│ Press enter to continue.                                                     │
│                                                                              │
│ Keyboard: [h] Help [q] Quit [b] Back [enter] Continue                        │
└──────────────────────────────────────────────────────────────────────────────┘

Purpose

  • Configures blockchain connections for indexing and monitoring
  • Ensures proper RPC endpoint setup for each chain
  • Explains the requirement for Ethereum mainnet

Key Features

  • Multiple chain support with standardized naming
  • RPC endpoint configuration and validation
  • Clear explanation of requirements and format

Chain Configuration

The chains configuration screen guides you through setting up:

  1. Ethereum Mainnet (Required)

    • At least one valid RPC endpoint
    • Used for core functionality and the Unchained Index
  2. Additional EVM Chains (Optional)

    • Sepolia, Gnosis, Optimism, and other EVM-compatible chains
    • Each requires at least one RPC endpoint
    • Enable/disable option for each chain

RPC Endpoint Requirements

For each chain, you must provide:

  • A valid RPC URL in the format protocol://host:port
  • Proper authentication details if required (e.g., Infura project ID)
  • Endpoints with sufficient capabilities for indexing (archive nodes recommended)
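As a sketch, the resulting chains section might look like the following. The key names are illustrative assumptions; mainnet remains required even when other chains are added.

```yaml
chains:
  mainnet:
    rpcs:
      - "http://localhost:8545"
    enabled: true
  sepolia:
    rpcs:
      - "https://sepolia.infura.io/v3/YOUR-PROJECT-ID"
    enabled: false
```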

Validation Checks

The wizard performs these validations on each RPC endpoint:

  • URL format validation
  • Connection test to verify the endpoint is reachable
  • Chain ID verification to ensure the endpoint matches the selected chain
  • API method support check for required JSON-RPC methods

Implementation

The chain configuration uses the Screen struct with specialized validation for RPC endpoints. The wizard prioritizes setting up Ethereum mainnet first, then offers options to configure additional chains as needed.

For each chain, the wizard walks through enabling the chain, configuring RPC endpoints, and validating the connection before proceeding to the next chain.
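A minimal sketch of the URL-shape check described above. The helper name is hypothetical and this is not Khedra's actual implementation; the wizard's real validator also probes the endpoint and verifies the chain ID.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateRPC enforces the protocol://host:port shape for RPC endpoints.
// Hypothetical helper for illustration only.
func validateRPC(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return fmt.Errorf("missing host in %q", raw)
	}
	return nil
}

func main() {
	fmt.Println(validateRPC("http://localhost:8545"))     // <nil>
	fmt.Println(validateRPC("localhost:8545") != nil)     // true: no protocol given
}
```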

Summary Screen

┌──────────────────────────────────────────────────────────────────────────────┐
│ Summary                                                                      │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ Question: Would you like to edit the config by hand?                         │
│ Current:  no                                                                 │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│ Press enter to finish the wizard. ("b"=back, "h"=help)                       │
│                                                                              │
│ Keyboard: [h] Help [q] Quit [b] Back [e] Edit [enter] Finish                 │
└──────────────────────────────────────────────────────────────────────────────┘

Purpose

  • Provides a review of all configured settings
  • Offers a final chance to make adjustments before saving
  • Summarizes the configuration in a clear, readable format

Configuration Summary Display

The summary screen presents the configuration organized by section:

  1. General Settings

    • Data folder location
    • Download strategy
    • Logging configuration
  2. Services Configuration

    • Enabled/disabled status for each service
    • Port numbers and key parameters
    • Resource allocations
  3. Chain Settings

    • Configured blockchains
    • RPC endpoints
    • Chain-specific settings

Final Options

From the summary screen, you can:

  1. Finish: Accept the configuration and write it to the config file
  2. Edit: Open the configuration in a text editor for manual changes
  3. Back: Return to previous screens to make adjustments
  4. Help: Access documentation about configuration options
  5. Quit: Exit without saving changes

When the user chooses to finish, the wizard writes the configuration to ~/.khedra/config.yaml by default, or to an alternative location if specified during the process.

If the user chooses to edit the file directly, the wizard will invoke the system's default editor (or the editor specified in the EDITOR environment variable) and then reload the configuration after editing.

Appendices

Glossary of Terms

  • EVM: Ethereum Virtual Machine, the runtime environment for smart contracts in Ethereum and similar blockchains.
  • RPC: Remote Procedure Call, a protocol allowing the application to communicate with blockchain nodes.
  • Indexing: The process of organizing blockchain data for fast and efficient retrieval.
  • IPFS: InterPlanetary File System, a decentralized storage system for sharing and retrieving data.

Frequently Asked Questions (FAQ)

1. What chains are supported by Khedra?

Khedra supports Ethereum mainnet and other EVM-compatible chains such as Sepolia and Gnosis. Additional chains can be added by configuring the TB_NODE_CHAINS environment variable.

2. Do I need an RPC endpoint for every chain?

Yes, each chain you want to index or interact with requires a valid RPC endpoint specified in the .env file.

3. Can I run Khedra without IPFS?

Yes, IPFS integration is optional and can be enabled or disabled using the --ipfs command-line option.

References and Further Reading

Index

  • Address Monitoring:

    • Documentation: Chapter 4, Section "Monitoring Addresses"
    • Implementation: app/action_daemon.go (Monitor service initialization and MonitorsOptions struct)
  • API Access:

    • Documentation: Chapter 4, Section "Accessing the REST API"
    • Implementation: app/action_daemon.go (API service initialization)
  • Blockchain Indexing:

    • Documentation: Chapter 4, Section "Indexing Blockchains"
    • Implementation: app/action_daemon.go (Scraper service initialization)
  • Chains Configuration:

  • Configuration Management:

  • Glossary: Chapter 7, Section "Glossary of Terms"

  • IPFS Integration:

  • Logging and Debugging:

  • RPC Endpoints:

  • Service Configuration:

  • Troubleshooting:

    • Documentation: Chapter 6, Section "Troubleshooting"
    • Implementation: Error handling throughout the codebase, especially in:
  • Wizard Interface:

    • Documentation: Chapter 6, Section "Installation Wizard"
    • Implementation:

Technical Specification

Purpose of this Document

This document defines the technical architecture, design, and functionalities of Khedra, enabling developers and engineers to understand its internal workings and design principles. For a less technical overview of the application, refer to the User Manual.

Intended Audience

This specification is for:

  • Developers working on Khedra or integrating it into applications.
  • System architects designing systems that use Khedra.
  • Technical professionals looking for a detailed understanding of the system.

Scope and Objectives

The specification covers:

  • High-level architecture.
  • Core functionalities such as blockchain indexing, REST API, and address monitoring.
  • Design principles, including scalability, error handling, and integration with IPFS.
  • Supported chains, RPC requirements, and testing methodologies.

Introduction

Purpose of This Document

This technical specification document provides a comprehensive overview of Khedra's architecture, implementation details, and technical design decisions. It serves as a reference for developers, system architects, and technical stakeholders who need to understand the system's inner workings, extend its functionality, or integrate with it.

System Overview

Khedra is a sophisticated blockchain indexing and monitoring solution designed with a local-first architecture. It creates and maintains the Unchained Index - a permissionless index of address appearances across blockchain data - enabling powerful monitoring capabilities for any address on any supported EVM-compatible chain.

Core Technical Components

  1. Indexing Engine: Processes blockchain data to extract and store address appearances
  2. Service Framework: Manages the lifecycle of modular services (scraper, monitor, API, IPFS, control)
  3. Data Storage Layer: Organizes and persists index data and caches
  4. Configuration System: Manages user preferences and system settings
  5. API Layer: Provides programmatic access to indexed data

Key Design Principles

Khedra's technical design adheres to several foundational principles:

  1. Local-First Processing: All data processing happens on the user's machine, maximizing privacy
  2. Chain Agnosticism: Support for any EVM-compatible blockchain with minimal configuration
  3. Modularity: Clean separation of concerns between services for flexibility and maintainability
  4. Resource Efficiency: Careful management of system resources, especially during indexing
  5. Resilience: Robust error handling and recovery mechanisms
  6. Extensibility: Well-defined interfaces to enable future enhancements

Technology Stack

Khedra is built on a modern technology stack:

  • Go: The primary implementation language, chosen for its performance, concurrency model, and cross-platform support
  • IPFS: For distributed sharing of index data
  • RESTful API: For service integration and data access
  • YAML: For configuration management
  • Structured Logging: For operational monitoring and debugging

Target Audience

This technical specification is intended for:

  • Developers: Contributing to Khedra or building on top of it
  • System Administrators: Deploying and maintaining Khedra instances
  • Technical Architects: Evaluating Khedra for integration with other systems
  • Advanced Users: Seeking a deeper understanding of how Khedra works

Document Structure

The remaining sections of this specification are organized as follows:

  • System Architecture: The high-level structure and components
  • Core Functionalities: Detailed explanations of key features
  • Technical Design: Implementation details and design patterns
  • Supported Chains: Technical requirements and integration details
  • Command-Line Interface: API and usage patterns
  • Performance and Scalability: Benchmarks and optimization strategies
  • Integration Points: APIs and interfaces for external systems
  • Testing and Validation: Approaches to quality assurance
  • Appendices: Technical reference materials

This specification aims to provide a comprehensive understanding of Khedra's technical aspects while serving as a reference for implementation, maintenance, and extension of the system.

System Architecture

Architectural Overview

Khedra employs a modular, service-oriented architecture designed for flexibility, resilience, and extensibility. The system is structured around a central application core that coordinates multiple specialized services, each with distinct responsibilities.

High-Level Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                      Khedra Application                          │
├─────────┬─────────┬─────────┬─────────┬─────────────────────────┤
│ Control │ Scraper │ Monitor │   API   │         IPFS            │
│ Service │ Service │ Service │ Service │        Service          │
├─────────┴─────────┴─────────┴─────────┴─────────────────────────┤
│                      Configuration Manager                       │
├─────────────────────────────────────────────────────────────────┤
│                          Data Layer                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐ │
│  │ Unchained│  │  Binary  │  │ Monitor  │  │ Chain-Specific   │ │
│  │   Index  │  │  Caches  │  │   Data   │  │     Data         │ │
│  └──────────┘  └──────────┘  └──────────┘  └──────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│                      Blockchain Connectors                       │
└─────────────────────────────────────────────────────────────────┘
             ▲                    ▲                     ▲
             │                    │                     │
 ┌───────────┴──────────┐ ┌──────┴───────┐  ┌──────────┴──────────┐
 │  Ethereum Mainnet    │ │   Testnets   │  │   Other EVM Chains  │
 └──────────────────────┘ └──────────────┘  └─────────────────────┘

Core Components

1. Khedra Application

The main application container that initializes, configures, and manages the lifecycle of all services. It provides:

  • Service registration and coordination
  • Application startup and shutdown sequences
  • Signal handling for graceful termination
  • Global state management
  • Cross-service communication

Implementation: app/khedra.go

2. Service Framework

Khedra implements five primary services:

2.1 Control Service

  • Provides management endpoints for other services
  • Handles service health monitoring
  • Enables runtime reconfiguration
  • Serves as the primary management interface

Implementation: pkg/services/control/service.go

2.2 Scraper Service

  • Processes blockchain data to build the Unchained Index
  • Extracts address appearances from transactions, logs, and traces
  • Manages indexing state and progress tracking
  • Handles retry logic for failed operations
  • Implements batch processing with configurable parameters

Implementation: pkg/services/scraper/service.go

2.3 Monitor Service

  • Tracks specified addresses for on-chain activity
  • Maintains focused indices for monitored addresses
  • Processes real-time blocks for quick notifications
  • Supports flexible notification configurations
  • Manages monitor definitions and states

Implementation: pkg/services/monitor/service.go

2.4 API Service

  • Exposes RESTful endpoints for data access
  • Implements query interfaces for the index and monitors
  • Handles authentication and rate limiting
  • Provides structured data responses in multiple formats
  • Includes Swagger documentation for API endpoints

Implementation: pkg/services/api/service.go

2.5 IPFS Service

  • Manages distributed sharing of index data
  • Handles chunking of index data for efficient distribution
  • Implements publishing and retrieval mechanisms
  • Provides peer discovery and connection management
  • Integrates with the IPFS network protocol

Implementation: pkg/services/ipfs/service.go

3. Configuration Manager

A centralized system for managing application settings, including:

  • Configuration file parsing and validation
  • Environment variable integration
  • Runtime configuration updates
  • Defaults management
  • Chain-specific configuration handling

Implementation: pkg/config/config.go

4. Data Layer

The persistent storage infrastructure for Khedra:

4.1 Unchained Index

  • Core data structure mapping addresses to appearances
  • Optimized for fast lookups and efficient storage
  • Implements chunking for distributed sharing
  • Includes versioning for format compatibility

Implementation: pkg/index/index.go

4.2 Binary Caches

  • Stores raw blockchain data for efficient reprocessing
  • Implements cache invalidation and management
  • Optimizes storage space usage with compression
  • Supports pruning and maintenance operations

Implementation: pkg/cache/cache.go

4.3 Monitor Data

  • Stores monitor definitions and state
  • Tracks monitored address appearances
  • Maintains notification history
  • Implements efficient storage for frequent updates

Implementation: pkg/monitor/data.go

4.4 Chain-Specific Data

  • Segregates data by blockchain
  • Stores chain metadata and state
  • Manages chain-specific configurations
  • Handles chain reorganizations

Implementation: pkg/chains/data.go

5. Blockchain Connectors

The interface layer between Khedra and blockchain nodes:

  • RPC client implementations
  • Connection pooling and management
  • Request rate limiting and backoff strategies
  • Error handling and resilience patterns
  • Chain-specific adaptations

Implementation: pkg/rpc/client.go

Communication Patterns

Khedra employs several communication patterns between components:

  1. Service-to-Service Communication: Structured message passing between services using channels
  2. RPC Communication: JSON-RPC communication with blockchain nodes
  3. REST API: HTTP-based communication for external interfaces
  4. File-Based Storage: Persistent data storage using structured files
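Pattern 1 can be illustrated with a toy command/acknowledgement loop over channels. The message type and commands are invented for the example; the real services exchange richer types.

```go
package main

import "fmt"

// msg is a minimal, invented control message for illustration.
type msg struct {
	service string
	command string
}

// run drains commands until the channel closes, acknowledging each one,
// then closes the ack channel so the sender knows the service stopped.
func run(cmds <-chan msg, acks chan<- string) {
	for m := range cmds {
		acks <- fmt.Sprintf("%s: %s ok", m.service, m.command)
	}
	close(acks)
}

func main() {
	cmds := make(chan msg)
	// acks is buffered so the sender is never blocked mid-conversation.
	acks := make(chan string, 4)
	go run(cmds, acks)

	cmds <- msg{"scraper", "pause"}
	cmds <- msg{"scraper", "resume"}
	close(cmds)

	for a := range acks {
		fmt.Println(a)
	}
	// Prints:
	// scraper: pause ok
	// scraper: resume ok
}
```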

Deployment Architecture

Khedra supports multiple deployment models:

  1. Standalone Application: Single-process deployment for individual users
  2. Docker Container: Containerized deployment for managed environments
  3. Distributed Deployment: Multiple instances sharing index data via IPFS

Security Architecture

Security considerations in Khedra's architecture include:

  1. Local-First Processing: Minimizes exposure of query data
  2. API Authentication: Optional key-based authentication for API access
  3. Configuration Protection: Secure handling of RPC credentials
  4. Update Verification: Integrity checks for application updates
  5. Resource Isolation: Service-level resource constraints

The modular design of Khedra allows for individual components to be extended, replaced, or enhanced without affecting the entire system, providing a solid foundation for future development and integration.

Core Functionalities

This section details Khedra's primary technical functionalities, explaining how each core feature is implemented and the technical approaches used.

Blockchain Indexing

The Unchained Index

The Unchained Index is the foundational data structure of Khedra, providing a reverse-lookup capability from addresses to their appearances in blockchain data.

Technical Implementation

The index is implemented as a specialized data structure with these key characteristics:

  1. Bloom Filter Front-End: A probabilistic data structure that quickly determines if an address might appear in a block
  2. Address-to-Appearance Mapping: Maps each address to a list of its appearances
  3. Chunked Storage: Divides the index into manageable chunks (typically 1,000,000 blocks per chunk)
  4. Versioned Format: Includes version metadata to handle format evolution

// Simplified representation of the index structure
type UnchainedIndex struct {
    Version string
    Chunks  map[uint64]*IndexChunk  // Key is chunk ID
}

type IndexChunk struct {
    BloomFilter   *BloomFilter
    Appearances   map[string][]Appearance  // Key is hex address
    StartBlock    uint64
    EndBlock      uint64
    LastUpdated   time.Time
}

type Appearance struct {
    BlockNumber      uint64
    TransactionIndex uint16
    AppearanceType   uint8
    LogIndex         uint16
}

Indexing Process

  1. Block Retrieval: Fetch blocks from the RPC endpoint in configurable batches
  2. Appearance Extraction: Process each block to extract address appearances from:
    • Transaction senders and recipients
    • Log topics and indexed parameters
    • Trace calls and results
    • State changes
  3. Deduplication: Remove duplicate appearances within the same transaction
  4. Storage: Update the appropriate index chunk with the new appearances
  5. Bloom Filter Update: Update the bloom filter for quick future lookups
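Step 3 (deduplication) can be sketched as follows. The `Appearance` struct is trimmed to the two fields the comparison needs, and the helper name is hypothetical:

```go
package main

import "fmt"

// Appearance is trimmed to the fields the duplicate check needs; the
// full struct appears earlier in this section.
type Appearance struct {
	BlockNumber      uint64
	TransactionIndex uint16
}

// dedupe keeps the first instance of each (block, transaction) pair,
// as in step 3 of the indexing process. The struct is comparable, so
// it can be used directly as a map key.
func dedupe(apps []Appearance) []Appearance {
	seen := make(map[Appearance]bool)
	var out []Appearance
	for _, a := range apps {
		if !seen[a] {
			seen[a] = true
			out = append(out, a)
		}
	}
	return out
}

func main() {
	apps := []Appearance{{100, 0}, {100, 0}, {100, 1}}
	fmt.Println(len(dedupe(apps))) // the duplicate within tx 0 collapses
}
```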

Performance Optimizations

  • Parallel Processing: Multiple blocks processed concurrently
  • Bloom Filters: Fast negative lookups to avoid unnecessary disk access
  • Binary Encoding: Compact storage format for index data
  • Caching: Frequently accessed index portions kept in memory
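The bloom-filter optimization rests on one property: a negative answer is definitive, so most chunks can be skipped without touching disk. A toy filter illustrates this; Khedra's real filter size and hash count differ:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a toy filter; Khedra's real parameters differ, but the
// fast-negative-lookup property is the same.
type Bloom struct {
	bits [1 << 16]bool
}

// indexes derives two bit positions from an FNV hash of the address.
func (b *Bloom) indexes(addr string) [2]uint32 {
	h := fnv.New32a()
	h.Write([]byte(addr))
	sum := h.Sum32()
	return [2]uint32{sum % (1 << 16), (sum >> 16) % (1 << 16)}
}

// Add marks the filter bits for an address.
func (b *Bloom) Add(addr string) {
	for _, i := range b.indexes(addr) {
		b.bits[i] = true
	}
}

// MightContain returning false is definitive, so the corresponding
// index chunk never needs to be read from disk.
func (b *Bloom) MightContain(addr string) bool {
	for _, i := range b.indexes(addr) {
		if !b.bits[i] {
			return false
		}
	}
	return true
}

func main() {
	var b Bloom
	addr := "0x0000000000000000000000000000000000000001"
	b.Add(addr)
	fmt.Println(b.MightContain(addr)) // true: a chunk read may be needed
}
```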

Address Monitoring

Monitor Implementation

The monitoring system tracks specific addresses for on-chain activity and provides notifications when activity is detected.

Technical Implementation

Monitors are implemented using these components:

  1. Monitor Registry: Central store of all monitored addresses
  2. Address Index: Fast lookup structure for monitored addresses
  3. Activity Tracker: Records and timestamps address activity
  4. Notification Manager: Handles alert distribution based on configuration

// Simplified monitor implementation
type Monitor struct {
    Address       string
    Description   string
    CreatedAt     time.Time
    LastActivity  time.Time
    Config        MonitorConfig
    ActivityLog   []Activity
}

type MonitorConfig struct {
    NotificationChannels []string
    Filters              *ActivityFilter
    Thresholds           map[string]interface{}
}

type Activity struct {
    BlockNumber      uint64
    TransactionHash  string
    Timestamp        time.Time
    ActivityType     string
    Details          map[string]interface{}
}

Monitoring Process

  1. Registration: Add addresses to the monitor registry
  2. Block Processing: As new blocks are processed, check for monitored addresses
  3. Activity Detection: When a monitored address appears, record the activity
  4. Notification: Based on configuration, send notifications via configured channels
  5. State Update: Update the monitor's state with the new activity
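Steps 2 and 3 (block processing and activity detection) reduce to a membership check against the registry. A minimal sketch, with notification and state update elided:

```go
package main

import "fmt"

// checkBlock returns the monitored addresses that appear in a block's
// appearance list, covering steps 2 and 3 of the monitoring process.
func checkBlock(monitored map[string]bool, appearances []string) []string {
	var hits []string
	for _, addr := range appearances {
		if monitored[addr] {
			hits = append(hits, addr)
		}
	}
	return hits
}

func main() {
	monitored := map[string]bool{
		"0x1111111111111111111111111111111111111111": true,
	}
	appearances := []string{
		"0x2222222222222222222222222222222222222222",
		"0x1111111111111111111111111111111111111111",
	}
	fmt.Println(checkBlock(monitored, appearances))
}
```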

Optimization Approaches

  • Focused Index: Maintain a separate index for just monitored addresses
  • Early Detection: Check monitored addresses early in the processing pipeline
  • Configurable Sensitivity: Allow users to set thresholds for notifications
  • Batched Notifications: Group notifications to prevent excessive alerts

API Service

RESTful Interface

The API service provides HTTP endpoints for querying indexed data and managing Khedra's operations.

Technical Implementation

The API is implemented using these components:

  1. HTTP Server: Handles incoming requests and routing
  2. Route Handlers: Process specific endpoint requests
  3. Authentication Middleware: Optional API key verification
  4. Response Formatter: Structures data in requested format (JSON, CSV, etc.)
  5. Documentation: Auto-generated Swagger documentation

// Simplified API route implementation
type APIRoute struct {
    Path        string
    Method      string
    Handler     http.HandlerFunc
    Description string
    Params      []Parameter
    Responses   map[int]Response
}

// API server initialization
func NewAPIServer(config Config) *APIServer {
    server := &APIServer{
        router: mux.NewRouter(),
        port:   config.Port,
        auth:   config.Auth,
    }
    server.registerRoutes()
    return server
}

API Endpoints

The API provides endpoints in several categories:

  1. Status Endpoints: System and service status information
  2. Index Endpoints: Query the Unchained Index for address appearances
  3. Monitor Endpoints: Manage and query address monitors
  4. Chain Endpoints: Blockchain information and operations
  5. Admin Endpoints: Configuration and management operations

Performance Considerations

  • Connection Pooling: Reuse connections for efficiency
  • Response Caching: Cache frequent queries with appropriate invalidation
  • Pagination: Limit response sizes for large result sets
  • Query Optimization: Efficient translation of API queries to index lookups
  • Rate Limiting: Prevent resource exhaustion from excessive requests

IPFS Integration

Distributed Index Sharing

The IPFS integration enables sharing and retrieving index chunks through the distributed IPFS network.

Technical Implementation

The IPFS functionality is implemented with these components:

  1. IPFS Node: Either embedded or external IPFS node connection
  2. Chunk Manager: Handles breaking the index into shareable chunks
  3. Publishing Logic: Manages uploading chunks to IPFS
  4. Discovery Service: Finds and retrieves chunks from the network
  5. Validation: Verifies the integrity of downloaded chunks

// Simplified IPFS service implementation
type IPFSService struct {
    node         *ipfs.CoreAPI
    chunkManager *ChunkManager
    config       IPFSConfig
}

type ChunkManager struct {
    chunkSize      uint64
    validationFunc func([]byte) bool
    storage        *Storage
}

Distribution Process

  1. Chunking: Divide the index into manageable chunks with metadata
  2. Publishing: Add chunks to IPFS and record their content identifiers (CIDs)
  3. Announcement: Share availability information through the network
  4. Discovery: Find chunks needed by querying the IPFS network
  5. Retrieval: Download needed chunks from peers
  6. Validation: Verify chunk integrity before integration

Optimization Strategies

  • Incremental Updates: Share only changed or new chunks
  • Prioritized Retrieval: Download most useful chunks first
  • Peer Selection: Connect to reliable peers for better performance
  • Background Syncing: Retrieve chunks in the background without blocking
  • Compressed Storage: Minimize bandwidth and storage requirements

Configuration Management

Flexible Configuration System

Khedra's configuration system provides multiple ways to configure the application, with clear precedence rules.

Technical Implementation

The configuration system is implemented with these components:

  1. YAML Parser: Reads the configuration file format
  2. Environment Variable Processor: Overrides from environment variables
  3. Validation Engine: Ensures configuration values are valid
  4. Defaults Manager: Provides sensible defaults where needed
  5. Runtime Updater: Handles configuration changes during operation

// Simplified configuration structure
type Config struct {
    General  GeneralConfig
    Chains   map[string]ChainConfig
    Services map[string]ServiceConfig
    Logging  LoggingConfig
}

// Configuration loading process
func LoadConfig(path string) (*Config, error) {
    config := DefaultConfig()
    
    // Load from file if exists
    if fileExists(path) {
        if err := loadFromFile(path, config); err != nil {
            return nil, err
        }
    }
    
    // Override with environment variables
    applyEnvironmentOverrides(config)
    
    // Validate the final configuration
    if err := validateConfig(config); err != nil {
        return nil, err
    }
    
    return config, nil
}

Configuration Sources

The system processes configuration from these sources, in order of precedence:

  1. Environment Variables: Highest precedence, override all other sources
  2. Configuration File: User-provided settings in YAML format
  3. Default Values: Built-in defaults for unspecified settings

Validation Rules

The configuration system enforces these kinds of validation:

  1. Type Validation: Ensures values have the correct data type
  2. Range Validation: Numeric values within acceptable ranges
  3. Format Validation: Strings matching required patterns (e.g., URLs)
  4. Dependency Validation: Related settings are consistent
  5. Resource Validation: Settings are compatible with available resources
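Format and range validation (rules 2 and 3) might look like the following sketch; the function name and bounds are illustrative:

```go
package main

import (
	"fmt"
	"net/url"
)

// validateRPC demonstrates format validation (the endpoint must parse
// as an http/https/ws/wss URL) and range validation (batch size must
// fall within illustrative bounds).
func validateRPC(endpoint string, batchSize int) error {
	u, err := url.Parse(endpoint)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https" &&
		u.Scheme != "ws" && u.Scheme != "wss") {
		return fmt.Errorf("invalid RPC endpoint: %q", endpoint)
	}
	if batchSize < 1 || batchSize > 10000 {
		return fmt.Errorf("batch_size %d out of range [1, 10000]", batchSize)
	}
	return nil
}

func main() {
	fmt.Println(validateRPC("https://ethereum-rpc.example.com", 500)) // passes
	fmt.Println(validateRPC("ftp://example.com", 500))                // wrong scheme
}
```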

These core functionalities form the technical foundation of Khedra, enabling its primary capabilities while providing the flexibility and performance required for blockchain data processing.

Technical Design

This section details the key technical design decisions, patterns, and implementation approaches used in Khedra.

Code Organization

Khedra follows a modular code organization pattern to promote maintainability and separation of concerns.

Directory Structure

khedra/
├── app/                 // Application core
│   ├── khedra.go        // Main application definition
│   └── commands/        // CLI command implementations
├── cmd/                 // Command line entry points
│   └── khedra/          // Main CLI command
├── pkg/                 // Core packages
│   ├── config/          // Configuration management
│   ├── services/        // Service implementations
│   │   ├── api/         // API service
│   │   ├── control/     // Control service
│   │   ├── ipfs/        // IPFS service
│   │   ├── monitor/     // Monitor service
│   │   └── scraper/     // Scraper service
│   ├── index/           // Unchained Index implementation
│   ├── cache/           // Caching logic
│   ├── chains/          // Chain-specific code
│   ├── rpc/             // RPC client implementations
│   ├── wizard/          // Configuration wizard
│   └── utils/           // Shared utilities
└── main.go              // Application entry point

Package Design Principles

  1. Clear Responsibilities: Each package has a single, well-defined responsibility
  2. Minimal Dependencies: Packages depend only on what they need
  3. Interface-Based Design: Dependencies defined as interfaces, not concrete types
  4. Internal Encapsulation: Implementation details hidden behind public interfaces
  5. Context-Based Operations: Functions accept context for cancellation and timeout

Service Architecture

Khedra implements a service-oriented architecture within a single application.

Service Interface

Each service implements a common interface:

type Service interface {
    // Initialize the service
    Init(ctx context.Context) error
    
    // Start the service
    Start(ctx context.Context) error
    
    // Stop the service
    Stop(ctx context.Context) error
    
    // Return the service name
    Name() string
    
    // Return the service status
    Status() ServiceStatus
    
    // Return service-specific metrics
    Metrics() map[string]interface{}
}

Service Lifecycle

  1. Registration: Services register with the application core
  2. Initialization: Services initialize resources and validate configuration
  3. Starting: Services begin operations in coordinated sequence
  4. Running: Services perform their core functions
  5. Stopping: Services gracefully terminate when requested
  6. Cleanup: Services release resources during application shutdown

Service Coordination

Services coordinate through several mechanisms:

  1. Direct References: Services can hold references to other services when needed
  2. Event Bus: Publish-subscribe pattern for decoupled communication
  3. Shared State: Limited shared state for cross-service information
  4. Context Propagation: Request context flows through service operations

Data Storage Design

Khedra employs a hybrid storage approach for different data types.

Directory Layout

~/.khedra/
├── config.yaml           // Main configuration file
├── data/                 // Main data directory
│   ├── mainnet/          // Chain-specific data
│   │   ├── cache/        // Binary caches
│   │   │   ├── blocks/   // Cached blocks
│   │   │   ├── traces/   // Cached traces
│   │   │   └── receipts/ // Cached receipts
│   │   ├── index/        // Unchained Index chunks
│   │   └── monitors/     // Address monitor data
│   └── [other-chains]/   // Other chain data
└── logs/                 // Application logs

Storage Formats

  1. Index Data: Custom binary format optimized for size and query speed
  2. Cache Data: Compressed binary representation of blockchain data
  3. Monitor Data: Structured JSON for flexibility and human readability
  4. Configuration: YAML for readability and easy editing
  5. Logs: Structured JSON for machine processing and analysis

Storage Persistence Strategy

  1. Atomic Writes: Prevent corruption during unexpected shutdowns
  2. Version Headers: Include format version for backward compatibility
  3. Checksums: Verify data integrity through hash validation
  4. Backup Points: Periodic snapshots for recovery
  5. Incremental Updates: Minimize disk writes for frequently changed data

Error Handling and Resilience

Khedra implements robust error handling to ensure reliability in various failure scenarios.

Error Categories

  1. Transient Errors: Temporary failures that can be retried (network issues, rate limiting)
  2. Persistent Errors: Failures requiring intervention (misconfiguration, permission issues)
  3. Fatal Errors: Unrecoverable errors requiring application restart
  4. Validation Errors: Issues with user input or configuration
  5. Resource Errors: Problems with system resources (disk space, memory)

Resilience Patterns

  1. Retry with Backoff: Exponential backoff for transient failures
  2. Circuit Breakers: Prevent cascading failures when services are unhealthy
  3. Graceful Degradation: Reduce functionality rather than failing completely
  4. Health Checks: Proactive monitoring of dependent services
  5. Recovery Points: Maintain state that allows resuming after failures

Error Reporting

  1. Structured Logging: Detailed error information in structured format
  2. Context Preservation: Include context when errors cross boundaries
  3. Error Wrapping: Maintain error chains without losing information
  4. User-Friendly Messages: Translate technical errors to actionable information
  5. Error Metrics: Track error rates and patterns for analysis

Concurrency Model

Khedra leverages Go's concurrency primitives for efficient parallel processing.

Concurrency Patterns

  1. Worker Pools: Process batches of blocks concurrently with controlled parallelism
  2. Fan-Out/Fan-In: Distribute work to multiple goroutines and collect results
  3. Pipelines: Connect processing stages with channels for streaming data
  4. Context Propagation: Pass cancellation signals through processing chains
  5. Rate Limiting: Control resource usage and external API calls
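The worker-pool and fan-in patterns combine naturally for block processing. In this sketch the per-block work is a placeholder; only the concurrency shape matches the description above:

```go
package main

import (
	"fmt"
	"sync"
)

// processBlocks fans block numbers out to a fixed pool of workers and
// fans the results back in. The per-block "work" is a placeholder.
func processBlocks(blocks []uint64, workers int) int {
	jobs := make(chan uint64)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				results <- 1 // stand-in for extracted appearances
			}
		}()
	}
	// Close results once every worker has drained the jobs channel.
	go func() { wg.Wait(); close(results) }()

	// Feed jobs from a separate goroutine so fan-in can start at once.
	go func() {
		for _, bn := range blocks {
			jobs <- bn
		}
		close(jobs)
	}()

	total := 0
	for n := range results {
		total += n
	}
	return total
}

func main() {
	fmt.Println(processBlocks([]uint64{1, 2, 3, 4, 5}, 3))
}
```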

Resource Management

  1. Connection Pooling: Reuse network connections to blockchain nodes
  2. Goroutine Limiting: Prevent excessive goroutine creation
  3. Memory Budgeting: Control memory usage during large operations
  4. I/O Throttling: Balance disk operations to prevent saturation
  5. Adaptive Concurrency: Adjust parallelism based on system capabilities

Synchronization Techniques

  1. Mutexes: Protect shared data structures from concurrent access
  2. Read/Write Locks: Optimize for read-heavy access patterns
  3. Atomic Operations: Use atomic primitives for simple counters and flags
  4. Channels: Communicate between goroutines and implement synchronization
  5. WaitGroups: Coordinate completion of multiple goroutines

Configuration Wizard

The configuration wizard provides an interactive interface for setting up Khedra.

Wizard Architecture

  1. Screen-Based Flow: Organized as a sequence of screens
  2. Question Framework: Standardized interface for user input
  3. Validation Layer: Real-time validation of user inputs
  4. Navigation System: Forward/backward movement between screens
  5. Help Integration: Contextual help for each configuration option

User Interface Design

  1. Text-Based UI: Terminal-friendly interface with box drawing
  2. Color Coding: Visual cues for different types of information
  3. Navigation Bar: Consistent display of available commands
  4. Progress Indication: Show position in the configuration process
  5. Direct Editing: Option to edit configuration files directly

Implementation Approach

The wizard uses a structured approach to manage screens and user interaction:

type Screen struct {
    Title         string
    Subtitle      string
    Body          string
    Instructions  string
    Replacements  []Replacement
    Questions     []Questioner
    Style         Style
    Current       int
    Wizard        *Wizard
    NavigationBar *NavigationBar
}

type Wizard struct {
    Config    *config.Config
    Screens   []Screen
    Current   int
    History   []int
    // Additional fields for wizard state
}

This design allows for a flexible, extensible configuration process that can adapt to different user needs and configuration scenarios.

Testing Strategy

Khedra employs a comprehensive testing strategy to ensure reliability and correctness.

Testing Levels

  1. Unit Tests: Verify individual functions and components
  2. Integration Tests: Test interaction between components
  3. Service Tests: Validate complete service behavior
  4. End-to-End Tests: Test full application workflows
  5. Performance Tests: Benchmark critical operations

Test Implementation

  1. Mock Objects: Simulate external dependencies
  2. Test Fixtures: Standard data sets for reproducible tests
  3. Property-Based Testing: Generate test cases to find edge cases
  4. Parallel Testing: Run tests concurrently for faster feedback
  5. Coverage Analysis: Track code coverage to identify untested areas

These technical design choices provide Khedra with a solid foundation for reliable, maintainable, and efficient operation across a variety of deployment scenarios and use cases.

Supported Chains

This section details the blockchain networks supported by Khedra, the technical requirements for each, and the implementation approaches for multi-chain support.

Chain Support Architecture

Khedra implements a flexible architecture for supporting multiple EVM-compatible blockchains simultaneously.

Chain Abstraction Layer

At the core of Khedra's multi-chain support is a chain abstraction layer that:

  1. Normalizes differences between chain implementations
  2. Provides a uniform interface for blockchain interactions
  3. Manages chain-specific configurations and behaviors
  4. Isolates chain-specific code from the core application logic

// Simplified Chain interface
type Chain interface {
    // Return the chain name
    Name() string
    
    // Return the chain ID
    ChainID() uint64
    
    // Get RPC client for this chain
    Client() rpc.Client
    
    // Get path to chain-specific data directory
    DataDir() string
    
    // Check if this chain requires special handling for a feature
    SupportsFeature(feature string) bool
    
    // Get chain-specific configuration
    Config() ChainConfig
}

Core Chain Requirements

For a blockchain to be fully supported by Khedra, it must meet these technical requirements:

RPC Support

The chain must provide an Ethereum-compatible JSON-RPC API with these essential methods:

  1. Basic Methods:

    • eth_blockNumber: Get the latest block number
    • eth_getBlockByNumber: Retrieve block data
    • eth_getTransactionReceipt: Get transaction receipts with logs
    • eth_chainId: Return the chain identifier
  2. Trace Support:

    • Either debug_traceTransaction or trace_transaction: Retrieve execution traces
    • Alternatively: trace_block or debug_traceBlockByNumber: Get all traces in a block

Data Structures

The chain must use compatible data structures:

  1. Addresses: 20-byte Ethereum-compatible addresses
  2. Transactions: Compatible transaction format with standard fields
  3. Logs: EVM-compatible event logs
  4. Traces: Call traces in a format compatible with Khedra's processors

Consensus and Finality

The chain should have:

  1. Deterministic Finality: Clear rules for when blocks are considered final
  2. Manageable Reorgs: Limited blockchain reorganizations
  3. Block Time Consistency: Relatively consistent block production times

Ethereum Mainnet

Ethereum mainnet is the primary supported chain and is required even when indexing other chains.

Special Considerations

  1. Block Range: Support for full historical range from genesis
  2. Archive Node: Full archive node required for historical traces
  3. Trace Support: Must support either Geth or Parity trace methods
  4. Size Considerations: Largest data volume among supported chains

Implementation Details

// Ethereum mainnet-specific configuration
type EthereumMainnetChain struct {
    BaseChain
    traceMethod string  // "geth" or "parity" style traces
}

func (c *EthereumMainnetChain) ProcessTraces(traces []interface{}) []Appearance {
    // Mainnet-specific trace processing logic
    // ...
}

EVM-Compatible Chains

Khedra supports a variety of EVM-compatible chains with minimal configuration.

Officially Supported Chains

These chains are officially supported with tested implementations:

  1. Ethereum Testnets:

    • Sepolia
    • Goerli (legacy support)
  2. Layer 2 Networks:

    • Optimism
    • Arbitrum
    • Polygon
  3. EVM Sidechains:

    • Gnosis Chain (formerly xDai)
    • Avalanche C-Chain
    • Binance Smart Chain

Chain Configuration

Each chain is configured with these parameters:

chains:
  mainnet:  # Chain identifier
    rpcs:   # List of RPC endpoints
      - "https://ethereum-rpc.example.com"
    enabled: true  # Whether the chain is active
    trace_support: "geth"  # Trace API style
    # Chain-specific overrides
    scraper:
      batch_size: 500

Chain-Specific Adaptations

Some chains require special handling:

  1. Optimism/Arbitrum: Modified trace processing for rollup architecture
  2. Polygon: Adjusted finality assumptions for PoS consensus
  3. BSC/Avalanche: Faster block times requiring different batch sizing

Chain Detection and Validation

Khedra implements robust chain detection and validation:

Auto-Detection

When connecting to an RPC endpoint:

  1. Query eth_chainId to determine the actual chain
  2. Verify against the configured chain identifier
  3. Detect trace method support through API probing
  4. Identify chain-specific capabilities
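Step 2 (verifying the configured chain ID) reduces to decoding the hex string that eth_chainId returns and comparing it to the configured value. The JSON-RPC call itself is elided so the sketch stays self-contained:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// verifyChainID decodes the hex result of eth_chainId (for example
// "0x1") and compares it to the configured chain identifier.
func verifyChainID(rpcResult string, configured uint64) (bool, error) {
	got, err := strconv.ParseUint(strings.TrimPrefix(rpcResult, "0x"), 16, 64)
	if err != nil {
		return false, err
	}
	return got == configured, nil
}

func main() {
	ok, _ := verifyChainID("0x1", 1) // mainnet
	fmt.Println(ok)
}
```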

Connection Management

For each configured chain:

  1. Connection Pool: Maintain multiple connections for parallel operations
  2. Failover Support: Try alternative endpoints when primary fails
  3. Health Monitoring: Track endpoint reliability and performance
  4. Rate Limiting: Respect provider-specific rate limits

Data Isolation

Khedra maintains strict data isolation between chains:

  1. Chain-Specific Directories: Separate storage locations for each chain
  2. Independent Indices: Each chain has its own Unchained Index
  3. Configuration Isolation: Chain-specific settings don't affect other chains
  4. Parallel Processing: Chains can be processed concurrently

Adding New Chain Support

For adding support for a new EVM-compatible chain:

  1. Configuration: Add the chain definition to config.yaml
  2. Custom Handling: Implement any chain-specific processors if needed
  3. Testing: Verify compatibility with the new chain
  4. Documentation: Update supported chains documentation

Example: Adding a New Chain

// Register a new chain type
func RegisterNewChain() {
    registry.RegisterChain("new-chain", func(config ChainConfig) (Chain, error) {
        return &NewChain{
            BaseChain: NewBaseChain(config),
            // Chain-specific initialization
        }, nil
    })
}

// Implement chain-specific behavior
type NewChain struct {
    BaseChain
    // Chain-specific fields
}

func (c *NewChain) SupportsFeature(feature string) bool {
    // Chain-specific feature support
    switch feature {
    case "trace":
        return true
    case "state_diff":
        return false
    default:
        return c.BaseChain.SupportsFeature(feature)
    }
}

Khedra's flexible chain support architecture allows it to adapt to the evolving ecosystem of EVM-compatible blockchains while maintaining consistent indexing and monitoring capabilities across all supported networks.

Command-Line Interface

Khedra provides a comprehensive command-line interface (CLI) for interacting with the system. This section details the CLI's design, implementation, and available commands.

CLI Architecture

The CLI is built using a hierarchical command structure implemented with the Cobra library, providing a consistent and user-friendly interface.

Design Principles

  1. Consistency: Uniform command structure and option naming
  2. Discoverability: Self-documenting with built-in help
  3. Composability: Commands can be combined in pipelines
  4. Feedback: Clear status and progress information
  5. Automation-Friendly: Structured output for scripting

Implementation Structure

func NewRootCommand() *cobra.Command {
    root := &cobra.Command{
        Use:   "khedra",
        Short: "Khedra is a blockchain indexing and monitoring tool",
        Long:  `A comprehensive tool for indexing, monitoring, and querying EVM blockchains`,
    }
    
    // Add global flags
    root.PersistentFlags().StringVar(&cfgFile, "config", "", "config file path")
    root.PersistentFlags().StringVar(&format, "format", "text", "output format (text, json, csv)")
    
    // Add commands
    root.AddCommand(NewStartCommand())
    root.AddCommand(NewStopCommand())
    root.AddCommand(NewStatusCommand())
    // ... additional commands
    
    return root
}

Command Structure

Khedra's CLI is organized into logical command groups.

Root Command

The base khedra command serves as the entry point and provides global options:

khedra [global options] command [command options] [arguments...]

Global options include:

  • --config: Specify an alternate configuration file path
  • --format: Control output format (text, json, csv)
  • --verbose: Enable verbose output
  • --quiet: Suppress non-error output
  • --chain: Specify the target blockchain (defaults to "mainnet")

Service Management Commands

Commands for controlling Khedra's services:

  • khedra start: Start all or specified services

    • --services=service1,service2: Specify services to start
    • --foreground: Run in the foreground (don't daemonize)
  • khedra stop: Stop all or specified services

    • --services=service1,service2: Specify services to stop
  • khedra restart: Restart all or specified services

    • --services=service1,service2: Specify services to restart
  • khedra status: Show status of all or specified services

    • --services=service1,service2: Specify services to check
    • --verbose: Show detailed status information

Index Management Commands

Commands for managing the Unchained Index:

  • khedra index status: Show index status

    • --show-gaps: Display gaps in the index
    • --analytics: Show index analytics
  • khedra index rebuild: Rebuild portions of the index

    • --start=X: Starting block number
    • --end=Y: Ending block number
  • khedra index verify: Verify index integrity

    • --repair: Attempt to repair issues

Monitor Commands

Commands for managing address monitors:

  • khedra monitor add ADDRESS [ADDRESS...]: Add addresses to monitor

    • --name=NAME: Assign a name to the monitor
    • --notifications=webhook,email: Configure notification methods
  • khedra monitor remove ADDRESS [ADDRESS...]: Remove monitored addresses

    • --force: Remove without confirmation
  • khedra monitor list: List all monitored addresses

    • --details: Show detailed information
  • khedra monitor activity ADDRESS: Show activity for a monitored address

    • --from=X: Starting block number
    • --to=Y: Ending block number
    • --limit=N: Limit number of results

Chain Management Commands

Commands for managing blockchain connections:

  • khedra chains list: List configured chains

    • --enabled-only: Show only enabled chains
  • khedra chains add NAME URL: Add a new chain configuration

    • --enable: Enable the chain after adding
  • khedra chains test NAME: Test connection to a chain

    • --verbose: Show detailed test results

Configuration Commands

Commands for managing Khedra's configuration:

  • khedra config show: Display current configuration

    • --redact: Hide sensitive information
    • --section=SECTION: Show only specified section
  • khedra config edit: Open configuration in an editor

  • khedra config wizard: Run interactive configuration wizard

    • --simple: Run simplified wizard with fewer options

Utility Commands

Additional utility commands:

  • khedra version: Show version information

    • --check-update: Check for updates
  • khedra cache prune: Prune old cache data

    • --older-than=30d: Prune data older than specified period
  • khedra export: Export data for external use

    • --address=ADDR: Export data for specific address
    • --format=csv: Export format
    • --output=FILE: Output file path

Implementation Details

Command Execution Flow

  1. Parsing: The CLI parses command-line arguments and flags
  2. Validation: Command options are validated for correctness
  3. Configuration: The application configuration is loaded
  4. Execution: The command is executed with provided options
  5. Output: Results are formatted according to the specified format

Output Formatting

The CLI supports multiple output formats:

  1. Text: Human-readable formatted text (default)
  2. JSON: Structured JSON for programmatic processing
  3. CSV: Comma-separated values for spreadsheet import

Output formatting is implemented through a formatter interface:

type OutputFormatter interface {
    Format(data interface{}) ([]byte, error)
}

// Implementations for different formats
type TextFormatter struct{}
type JSONFormatter struct{}
type CSVFormatter struct{}
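A JSON formatter satisfying this interface might look like the sketch below; the real formatters carry additional options, and the interface is restated so the example is self-contained:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OutputFormatter mirrors the interface shown above.
type OutputFormatter interface {
	Format(data interface{}) ([]byte, error)
}

// JSONFormatter is a sketch; the real formatter may carry options
// such as indentation width and field selection.
type JSONFormatter struct{}

func (JSONFormatter) Format(data interface{}) ([]byte, error) {
	return json.MarshalIndent(data, "", "  ")
}

func main() {
	var f OutputFormatter = JSONFormatter{}
	out, _ := f.Format(map[string]string{"service": "scraper", "state": "running"})
	fmt.Println(string(out))
}
```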

Error Handling

CLI commands follow consistent error handling patterns:

  1. User Errors: Clear messages for incorrect usage
  2. System Errors: Detailed information for system-level issues
  3. Exit Codes: Specific exit codes for different error types

Example error handling in a command:

func runStatusCommand(cmd *cobra.Command, args []string) error {
    services, err := getServicesToCheck(cmd)
    if err != nil {
        return fmt.Errorf("invalid service selection: %w", err)
    }
    
    status, err := app.GetServiceStatus(services)
    if err != nil {
        return fmt.Errorf("failed to get status: %w", err)
    }
    
    formatter := getFormatter(cmd)
    output, err := formatter.Format(status)
    if err != nil {
        return fmt.Errorf("failed to format output: %w", err)
    }
    
    fmt.Fprintln(cmd.OutOrStdout(), string(output))
    return nil
}

Command Autocompletion

The CLI generates shell completion scripts for popular shells:

  • khedra completion bash: Generate Bash completion script
  • khedra completion zsh: Generate Zsh completion script
  • khedra completion fish: Generate Fish completion script
  • khedra completion powershell: Generate PowerShell completion script

Command Documentation

All commands include detailed help information accessible via:

  • khedra --help: General help
  • khedra command --help: Command-specific help
  • khedra command subcommand --help: Subcommand-specific help

This comprehensive CLI design provides users with a powerful and flexible interface for interacting with Khedra's functionality through the command line.

Performance and Scalability

This section details Khedra's performance characteristics, optimization strategies, and scalability considerations.

Performance Benchmarks

Indexing Performance

Typical indexing performance metrics on reference hardware (8-core CPU, 16GB RAM, SSD storage):

Chain       Block Processing Rate   Trace Processing Rate   Disk Space per 1M Blocks
Mainnet     15-25 blocks/sec        5-15 blocks/sec         1.5-2.5 GB
Testnets    20-30 blocks/sec        8-20 blocks/sec         0.5-1.5 GB
L2 Chains   30-50 blocks/sec        10-25 blocks/sec        1.0-2.0 GB

Factors affecting indexing performance:

  1. RPC Endpoint Performance: Quality and latency of the blockchain RPC connection
  2. Trace Availability: Whether traces are locally available or require RPC calls
  3. Block Complexity: Number of transactions and traces in each block
  4. Hardware Specifications: CPU cores, memory, and disk I/O capacity
  5. Network Conditions: Bandwidth and latency for remote RPC endpoints

Query Performance

Performance metrics for common queries:

Query Type                  Response Time (cold)   Response Time (warm)
Address Appearance Lookup   50-200ms               10-50ms
Block Range Scan            100-500ms              20-100ms
Monitor Status Check        20-100ms               5-20ms
API Status Endpoints        5-20ms                 1-5ms

Factors affecting query performance:

  1. Index Structure: Organization and optimization of the Unchained Index
  2. Memory Cache: Availability of data in memory versus disk access
  3. Query Complexity: Number of addresses and block range size
  4. Hardware Specifications: Particularly memory and disk speed
  5. Concurrent Load: Number of simultaneous queries being processed

Performance Optimization Strategies

Memory Management

Khedra implements several memory optimization techniques:

  1. Bloom Filters: Space-efficient probabilistic data structures to quickly determine if an address might appear in a block
  2. LRU Caching: Least Recently Used caching for frequently accessed data
  3. Memory Pooling: Reuse of allocated memory for similar operations
  4. Batch Processing: Processing multiple items in batches to amortize overhead
  5. Incremental GC: Tuned garbage collection to minimize pause times

Implementation example:

// Bloom filter implementation for quick address lookups
type AppearanceBloomFilter struct {
    filter     *bloom.BloomFilter
    capacity   uint
    errorRate  float64
}

func NewAppearanceBloomFilter(expectedItems uint) *AppearanceBloomFilter {
    return &AppearanceBloomFilter{
        filter:    bloom.NewWithEstimates(expectedItems, 0.01),
        capacity:  expectedItems,
        errorRate: 0.01,
    }
}

func (bf *AppearanceBloomFilter) Add(address []byte) {
    bf.filter.Add(address)
}

func (bf *AppearanceBloomFilter) MayContain(address []byte) bool {
    return bf.filter.Test(address)
}

Disk I/O Optimization

Strategies for optimizing disk operations:

  1. Sequential Writes: Organize write patterns for sequential access where possible
  2. Write Batching: Combine multiple small writes into larger operations
  3. Read-Ahead Buffering: Anticipate and pre-load data likely to be needed
  4. Cache Warming: Proactively load frequently accessed data into memory
  5. Compression: Reduce storage requirements and I/O bandwidth

Example implementation:

// Batched write implementation
type BatchedWriter struct {
    buffer     []byte
    maxSize    int
    flushThreshold int
    target     io.Writer
    mutex      sync.Mutex
}

func (w *BatchedWriter) Write(p []byte) (n int, err error) {
    w.mutex.Lock()
    defer w.mutex.Unlock()

    // Add to buffer
    w.buffer = append(w.buffer, p...)

    // Flush when the threshold is reached. The io.Writer contract
    // requires n <= len(p), so report only the bytes consumed from p,
    // not the (possibly larger) number of bytes flushed.
    if len(w.buffer) >= w.flushThreshold {
        if _, err := w.Flush(); err != nil {
            return 0, err
        }
    }

    return len(p), nil
}

func (w *BatchedWriter) Flush() (n int, err error) {
    if len(w.buffer) == 0 {
        return 0, nil
    }
    
    n, err = w.target.Write(w.buffer)
    w.buffer = w.buffer[:0] // Clear buffer
    return n, err
}

Concurrency Management

Techniques for efficient parallel processing:

  1. Worker Pools: Fixed-size pools of worker goroutines for controlled parallelism
  2. Pipeline Processing: Multi-stage processing with each stage running concurrently
  3. Batched Distribution: Group work items for efficient parallelization
  4. Backpressure Mechanisms: Prevent resource exhaustion during high load
  5. Adaptive Parallelism: Adjust concurrency based on system load and resources

Example worker pool implementation:

// Worker pool for parallel block processing
type BlockWorkerPool struct {
    workers     int
    queue       chan BlockTask
    results     chan BlockResult
    wg          sync.WaitGroup
    ctx         context.Context
    cancel      context.CancelFunc
}

func NewBlockWorkerPool(workers int) *BlockWorkerPool {
    ctx, cancel := context.WithCancel(context.Background())
    pool := &BlockWorkerPool{
        workers: workers,
        queue:   make(chan BlockTask, workers*2),
        results: make(chan BlockResult, workers*2),
        ctx:     ctx,
        cancel:  cancel,
    }
    
    // Start worker goroutines
    pool.wg.Add(workers)
    for i := 0; i < workers; i++ {
        go pool.worker(i)
    }
    
    return pool
}

func (p *BlockWorkerPool) worker(id int) {
    defer p.wg.Done()
    
    for {
        select {
        case <-p.ctx.Done():
            return
        case task, ok := <-p.queue:
            if !ok {
                return
            }
            
            result := processBlock(task)
            p.results <- result
        }
    }
}
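The same fan-out pattern in a compact, runnable form; the int task and result types are stand-ins, and squaring stands in for processBlock:

```go
package main

import (
    "fmt"
    "sync"
)

// processAll fans tasks out to a fixed pool of workers and collects results.
func processAll(blocks []int, workers int) []int {
    tasks := make(chan int)
    results := make(chan int, len(blocks))

    var wg sync.WaitGroup
    wg.Add(workers)
    for i := 0; i < workers; i++ {
        go func() {
            defer wg.Done()
            for b := range tasks {
                results <- b * b // stand-in for processBlock
            }
        }()
    }

    // Feed the queue, then close it so workers drain and exit.
    for _, b := range blocks {
        tasks <- b
    }
    close(tasks)
    wg.Wait()
    close(results)

    var out []int
    for r := range results {
        out = append(out, r)
    }
    return out
}

func main() {
    fmt.Println(len(processAll([]int{1, 2, 3, 4}, 2))) // 4
}
```

Closing the task channel is what lets each worker's range loop terminate; the WaitGroup then guarantees all results are in flight before the results channel is closed.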

Scalability Considerations

Vertical Scaling

Khedra is designed to efficiently utilize additional resources when available:

  1. CPU Utilization: Automatic adjustment to use available CPU cores
  2. Memory Utilization: Configurable memory limits for caching and processing
  3. Storage Scaling: Support for high-performance storage devices
  4. I/O Optimization: Tuning based on available I/O capacity

Configuration parameters for vertical scaling:

services:
  scraper:
    concurrency: 8         # Number of parallel workers
    memory_limit: "4GB"    # Maximum memory usage
    batch_size: 1000       # Items per processing batch
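One plausible way a service could resolve its worker count is to treat a missing or zero concurrency value as "use all available cores." This fallback is an assumption for illustration, not documented Khedra behavior:

```go
package main

import (
    "fmt"
    "runtime"
)

// effectiveConcurrency falls back to the machine's logical core count
// when the configured value is zero or negative.
func effectiveConcurrency(configured int) int {
    if configured > 0 {
        return configured
    }
    return runtime.NumCPU()
}

func main() {
    fmt.Println(effectiveConcurrency(8))     // honors an explicit setting
    fmt.Println(effectiveConcurrency(0) > 0) // true: defaults to core count
}
```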

Horizontal Scaling

While Khedra runs as a single process, it supports distributed operation through:

  1. Multi-Instance Deployment: Running multiple instances focusing on different chains
  2. Shared Index via IPFS: Collaborative building and sharing of the index
  3. Split Processing: Dividing block ranges between instances
  4. API Load Balancing: Distributing API queries across instances

Example multi-instance deployment:

Instance 1: Mainnet indexing
Instance 2: L2 chains indexing
Instance 3: API service
Instance 4: Monitor service

Data Volume Management

Strategies for handling large data volumes:

  1. Selective Indexing: Configure which data types to index (transactions, logs, traces)
  2. Retention Policies: Automatically prune older cache data while preserving the index
  3. Compression: Reduce storage requirements through data compression
  4. Tiered Storage: Move less frequently accessed data to lower-cost storage

Example retention configuration:

cache:
  retention:
    blocks: "30d"      # Keep block data for 30 days
    traces: "15d"      # Keep trace data for 15 days
    receipts: "60d"    # Keep receipt data for 60 days
  compression: true    # Enable data compression

Performance Monitoring

Khedra includes built-in performance monitoring capabilities:

  1. Metrics Collection: Runtime statistics for key operations
  2. Performance Logging: Timing information for critical paths
  3. Resource Monitoring: Tracking of CPU, memory, and disk usage
  4. Bottleneck Detection: Identification of performance limitations

Example metrics available:

{
  "scraper": {
    "blocks_processed": 1520489,
    "blocks_per_second": 18.5,
    "current_block": 18245367,
    "last_processed": "2023-06-15T14:23:45Z",
    "memory_usage_mb": 2458,
    "rpc_calls": 247896,
    "processing_latency_ms": 54
  }
}

Resource Requirements

Recommended system specifications based on usage patterns:

Minimum Requirements

  • CPU: 4 cores
  • Memory: 8GB RAM
  • Storage: 250GB SSD
  • Network: 10Mbps stable connection
  • Supported Workload: Monitoring a few addresses, single chain, limited API usage

Recommended Configuration

  • CPU: 8 cores
  • Memory: 16GB RAM
  • Storage: 1TB NVMe SSD
  • Network: 100Mbps stable connection
  • Supported Workload: Full indexing of mainnet, multiple monitored addresses, moderate API usage

High-Performance Configuration

  • CPU: 16+ cores
  • Memory: 32GB+ RAM
  • Storage: 2TB+ NVMe SSD with high IOPS
  • Network: 1Gbps+ connection
  • Supported Workload: Multiple chains, extensive monitoring, heavy API usage, IPFS participation

These performance optimizations and scalability considerations enable Khedra to handle the demands of blockchain data processing efficiently across a wide range of hardware configurations and usage scenarios.

Integration Points

Integration with External APIs

Khedra exposes data through a REST API, making it compatible with external applications. Example use cases:

  • Fetching transaction details for a given address.
  • Retrieving block information for analysis.

Interfacing with IPFS

Data indexed by Khedra can be pinned to IPFS for decentralized storage:

./khedra --ipfs on

Customizing for Specific Use Cases

Users can tailor the configuration by:

  • Adjusting .env variables to include specific chains and RPC endpoints.
  • Writing custom scripts to query the REST API and process the data.

Testing and Validation

Unit Testing

Unit tests cover:

  • Blockchain indexing logic.
  • Configuration parsing and validation.
  • REST API endpoint functionality.

Run tests with:

go test ./...

Integration Testing

Integration tests ensure all components work together as expected. Tests include:

  • RPC connectivity validation.
  • Multi-chain indexing workflows.

Testing Guidelines for Developers

  1. Use mock RPC endpoints for testing without consuming live resources.
  2. Validate .env configuration in test environments before deployment.
  3. Automate tests with CI/CD pipelines to ensure reliability.

Appendices

Glossary of Technical Terms

  • EVM: Ethereum Virtual Machine, the runtime environment for smart contracts.
  • RPC: Remote Procedure Call, a protocol for interacting with blockchain nodes.
  • IPFS: InterPlanetary File System, a decentralized storage solution.

References and Resources

Index

  • Address Monitoring: Section 3, Core Functionalities
  • API Access: Section 3, Core Functionalities
  • Architecture Overview: Section 2, System Architecture
  • Blockchain Indexing: Section 3, Core Functionalities
  • Configuration Files: Section 4, Technical Design
  • Data Flow: Section 4, Technical Design
  • Error Handling: Section 4, Technical Design
  • Integration Points: Section 8, Integration Points
  • IPFS Integration: Section 3, Core Functionalities; Section 8, Integration Points
  • Logging: Section 4, Technical Design
  • Performance Benchmarks: Section 7, Performance and Scalability
  • REST API: Section 3, Core Functionalities; Section 8, Integration Points
  • RPC Requirements: Section 5, Supported Chains and RPCs
  • Scalability Strategies: Section 7, Performance and Scalability
  • System Components: Section 2, System Architecture
  • Testing Guidelines: Section 9, Testing and Validation