The Khedra Book
Khedra (pronounced kɛd-ɾɑ) is an all-in-one "long-running" tool for indexing and sharing the Unchained Index and monitoring individual addresses on EVM-compatible blockchains.
The tool creates and shares the Unchained Index, a permissionless index of "address appearances," including appearances in event logs, execution traces, incoming transactions, modifications to smart contract state, staking or block rewards, prefund allocations, and many other locations.
This detailed indexing allows for near-perfect monitoring and notifications of address activity, which leads to many benefits. The benefits include native and ERC-20 account balance histories, address auditing and accounting, and even custom indexing. It works for any address on any chain (as long as you have access to the chain's RPC).
Enjoy!
Please help us improve this software by providing any feedback or suggestions. Contact information and links to our socials are available at our website.
About the Name
The name khedra (pronounced kɛd-ɾɑ) is inspired by the Persian word خدمت (khedmat), meaning "service."
In ancient Persian culture, service was considered a noble pursuit, emphasizing dedication, reliability, and humility in action. Drawing from this tradition, the name khedra embodies the essence of a system designed to serve--efficiently, continuously, and with purpose.
Similar to its counterpart, chifra (derived from the Persian word for "cipher"), the name khedra symbolizes a long-running, dependable process that tirelessly "serves" the needs of its users.
More technically, khedra is a collection of goroutines that:
- create and publish the Unchained Index,
- monitor a user-provided, customized list of addresses, automating caching, notifications, and other ETL processes,
- provide a RESTful API exposing chifra's many data access commands,
- allow for starting, stopping, pausing, and resuming these individual services.
By choosing the name khedra, we honor a legacy of service while committing to building tools that are as resilient, adaptive, and reliable as the meaning behind its name.
User Manual
Overview of Khedra
Khedra is a blockchain indexing and monitoring application designed to provide users with an efficient way to interact with and manage transactional histories for EVM-compatible blockchains. It supports functionalities such as transaction monitoring, address indexing, publishing and pinning the indexes to IPFS and a smart contract, and a RESTful API for accessing data.
Purpose of this Document
This "User's Manual" is designed to help users get started with Khedra, understand its features, and operate the application effectively for both basic and advanced use cases. For a more technical treatment of the software, refer to the Technical Specification.
Intended Audience
This manual is intended for:
- End-users looking to index and monitor blockchain data.
- Developers integrating blockchain data into their applications.
- System administrators managing blockchain-related infrastructure.
Introduction
Blockchains are long-running processes that continually create new data (in the form of blocks). For this reason, any process that wishes to monitor, index, or access data from a blockchain must also be long-running.
Khedra is such a long-running process.
In order to remain decentralized and permissionless, blockchains must be "freed" from the stranglehold of large data providers. One way to do that is to help people run blockchain nodes locally. However, as soon as one does that, one learns that blockchains are not very good databases. The reason is simple: they lack an index.
TrueBlocks Core (of which chifra and khedra are a part) is a set of command-line tools, SDKs, and packages that help users who are running their own blockchain nodes make better use of the data. Khedra indexes and monitors the data. Chifra helps access the data, providing various useful commands for exporting, filtering, and processing on-chain activity.
Of primary importance in the design of both systems are:
- speed - we cache nearly everything
- permissionless access - no servers, no API keys, you run your own infrastructure
- accuracy - the goal is 100% off-chain reconciliation of account balances and state history
- depth of detail - required to enable 100% accurate reconciliations
- ease of use - so shoot us - this one is hard
Getting Started
Overview
Khedra runs primarily from a configuration file called config.yaml. This file lives at ~/.khedra/config.yaml by default. If the file is not found, Khedra creates a default configuration in this location.
The config file allows you to specify key parameters for running khedra, including which chains to index/monitor, which services to enable, how detailed to log the processes, and where and how to publish (that is, share) the results.
You may use environment variables to override specific options. This document outlines the configuration file structure, validation rules, default values, and environment variable usage.
Quick Start
- Download, build, and test khedra:
  git clone https://github.com/TrueBlocks/trueblocks-khedra.git
  cd trueblocks-khedra
  go build -o khedra main.go
  ./khedra version
  You should get something similar to khedra v4.0.0-release.
- Establish the config file and edit values for your system:
  mkdir -p ~/.khedra
  cp config.yaml.example ~/.khedra/config.yaml
  ./khedra config edit
  Modify the file according to your requirements (see below).
  The minimal configuration needed is to provide a valid RPC to Ethereum mainnet. (All configurations require access to Ethereum mainnet.) You may configure as many other EVM-compatible chains (each with its own RPC) as you like.
- Location of the configuration file:
  By default, the config file resides at ~/.khedra/config.yaml. (The folder and the file will be created if they do not exist.) You may, however, place a config.yaml file in the current working folder (the folder from which you run khedra). If found locally, this configuration file will dominate. This allows for running multiple instances of the software concurrently.
  If no config.yaml file is found, khedra creates a default configuration in its default location.
- Using Environment Variables:
  You may override configuration options using environment variables, each of which must take the form TB_KHEDRA_<section>_<key>.
  For example, the following overrides the general.dataFolder value.
  export TB_KHEDRA_GENERAL_DATAFOLDER="/path/override"
  You'll notice that underbars (_) in the <key> names are not needed.
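The lookup order for the configuration file described above can be sketched in Go. This is only an illustration of the stated rule (a local config.yaml wins over the default location), not khedra's actual code; the function names and the exists callback are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveConfigPath returns the config file preferred under the rule stated
// above: a config.yaml in the working folder wins, otherwise the default
// location under the home folder is used. The exists callback is a
// hypothetical hook that keeps the logic easy to test.
func resolveConfigPath(cwd, home string, exists func(string) bool) string {
	local := filepath.Join(cwd, "config.yaml")
	if exists(local) {
		return local
	}
	return filepath.Join(home, ".khedra", "config.yaml")
}

// fileExists reports whether path names an existing non-directory entry.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working folder:", err)
		return
	}
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home folder:", err)
		return
	}
	fmt.Println("config file:", resolveConfigPath(cwd, home, fileExists))
}
```

Running one instance per working folder, each with its own local config.yaml, is what makes the concurrent setup mentioned above possible.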
Configuration File Format
The config.yaml file (shown here with default values) is structured as follows:
# Khedra Configuration File
# Version: 2.0
general:
  dataFolder: "~/.khedra/data"  # See note 1
chains:
  mainnet:  # Blockchain name (see notes 2, 3, and 4)
    rpcs:  # A list of RPC endpoints (at least one is required)
      - "rpc_endpoint_for_mainnet"
    enabled: true  # `true` if this chain is enabled
  sepolia:
    rpcs:
      - "rpc_endpoint_for_sepolia"
    enabled: true
  gnosis:  # Add as many chains as your machine can handle
    rpcs:
      - "rpc_endpoint_for_gnosis"  # must be a reachable, valid URL if the chain is enabled
    enabled: false  # this chain is disabled
  optimism:
    rpcs:
      - "rpc_endpoint_for_optimism"
    enabled: false
services:  # See note 5
  scraper:  # Required. (One of: api, scraper, monitor, ipfs, control)
    enabled: true  # `true` if the service is enabled
    sleep: 12  # Seconds between scraping batches (see note 6)
    batchSize: 500  # Number of blocks to process in a batch (range: 50-10000)
  monitor:
    enabled: true
    sleep: 12  # Seconds between monitoring batches (see note 6)
    batchSize: 500  # Number of blocks to process in a batch (range: 50-10000)
  api:
    enabled: true
    port: 8080  # Port number for API service (the port must be available)
  ipfs:
    enabled: true
    port: 5001  # Port number for IPFS service (the port must be available)
  control:
    enabled: true  # Always enabled - false values are invalid
    port: 5001  # Port number for control service (the port must be available)
logging:
  folder: "~/.khedra/logs"  # Path to log directory (must exist and be writable)
  filename: "khedra.log"  # Log file name (must end with .log)
  level: "info"  # One of: debug, info, warn, error
  maxSize: 10  # Max log file size in MB
  maxBackups: 5  # Number of backup log files to keep
  maxAge: 30  # Number of days to retain old logs
  compress: true  # Whether to compress backup logs
Notes:
1. The dataFolder value must be a valid, existing directory that is writable. You may wish to change this value to a location with suitable disk space. Depending on configuration, the Unchained Index and binary caches may approach 200GB.
2. The chains section is required. At least one chain must be enabled.
3. If chains other than Ethereum mainnet are configured, you must also configure Ethereum mainnet. The software reads mainnet smart contracts (such as the Unchained Index and Uniswap) during normal operation.
4. We've used this repository to identify chain names. Using consistent chain names aids in sharing indexes. Use these values in your configuration if you wish to fully participate in sharing the Unchained Index.
5. The services section is required. At least one service must be enabled.
6. When a scraper or monitor is "catching up" to a chain, the sleep value is ignored.
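The interaction between sleep and batchSize when a service is catching up can be illustrated with a small Go sketch. It assumes, purely for illustration, that "catching up" means the backlog is larger than one batch; the function name and the chain state used in main are hypothetical and are not taken from khedra's source.

```go
package main

import (
	"fmt"
	"time"
)

// pauseAfter returns how long to wait after processing a batch. While the
// backlog (head - last) exceeds one batch, the service is treated as
// "catching up" and does not sleep at all. This threshold is an assumption
// made for this sketch.
func pauseAfter(last, head, batchSize uint64, sleep time.Duration) time.Duration {
	if head > last && head-last > batchSize {
		return 0
	}
	return sleep
}

func main() {
	const batchSize = 500
	sleep := 12 * time.Second
	last, head := uint64(0), uint64(1800) // hypothetical chain state

	// Walk the backlog batch by batch and report the pause that would follow.
	for last < head {
		to := last + batchSize
		if to > head {
			to = head
		}
		wait := pauseAfter(to, head, batchSize, sleep)
		fmt.Printf("processed blocks %d-%d, then wait %s\n", last+1, to, wait)
		last = to
	}
}
```

The sketch only prints the pause it would take; a real service would call time.Sleep with that value before requesting the next batch.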
Using Environment Variables
Khedra allows configuration values to be overridden at runtime using environment variables. The value of an environment variable takes precedence over the defaults and the configuration file.
The environment variable naming convention is:
TB_KHEDRA_<section>_<key>
For example:
- To override the general.dataFolder value:
  export TB_KHEDRA_GENERAL_DATAFOLDER="/path/override"
- To override logging.level:
  export TB_KHEDRA_LOGGING_LEVEL="debug"
- To override services.scraper.batchSize:
  export TB_KHEDRA_SERVICES_SCRAPER_BATCHSIZE="100"
Underbars (_) in <key> names are not used and should be omitted.
Overriding Chains and Services
Environment variables can also be used to override values for chains and services settings. The naming convention for these sections is as follows:
TB_KHEDRA_<section>_<name>_<key>
Where:
- <section> is either CHAINS or SERVICES.
- <name> is the name of the chain or service (converted to uppercase).
- <key> is the specific field to override.
Examples
To override the RPC endpoints for the mainnet chain:
export TB_KHEDRA_CHAINS_MAINNET_RPCS="http://rpc1.mainnet,http://rpc2.mainnet"
You may list multiple RPC endpoints by separating them with commas.
To disable the mainnet chain:
export TB_KHEDRA_CHAINS_MAINNET_ENABLED="false"
To enable the api service:
export TB_KHEDRA_SERVICES_API_ENABLED="true"
To set the port for the api service:
export TB_KHEDRA_SERVICES_API_PORT="8088"
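The naming conventions above are mechanical enough to express as a short Go helper. This is a sketch of the documented pattern only; the envName function is a hypothetical name used for illustration and is not part of khedra.

```go
package main

import (
	"fmt"
	"strings"
)

// envName builds an override variable name from its parts, for example
// envName("general", "dataFolder") or envName("chains", "mainnet", "rpcs").
// Each part is uppercased and any underbars inside a part are dropped,
// matching the convention described above.
func envName(parts ...string) string {
	segs := []string{"TB", "KHEDRA"}
	for _, p := range parts {
		p = strings.ReplaceAll(p, "_", "")
		segs = append(segs, strings.ToUpper(p))
	}
	return strings.Join(segs, "_")
}

func main() {
	fmt.Println(envName("general", "dataFolder"))
	fmt.Println(envName("chains", "mainnet", "rpcs"))
	fmt.Println(envName("services", "api", "port"))
}
```

A helper like this is handy in deployment scripts that generate overrides for many chains at once.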
Precedence Rules
- Default values are loaded first,
- Values from config.yaml override the defaults,
- Environment variables take precedence over both the defaults and the file.
The values set by environment variables must conform to the same validation rules as the configuration file.
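For a single string setting, the precedence order above might be sketched as follows. The resolve function and its parameters are hypothetical; treating an empty environment value as "not set" is an assumption of this sketch (the text only states that behavior for RPCs).

```go
package main

import (
	"fmt"
	"os"
)

// resolve applies the three-step precedence to one string setting.
// fileValue is empty when config.yaml does not set the key; lookup is
// typically os.LookupEnv. An empty environment value is treated as unset,
// which is an assumption made for this sketch.
func resolve(def, fileValue, envKey string, lookup func(string) (string, bool)) string {
	value := def
	if fileValue != "" {
		value = fileValue
	}
	if v, ok := lookup(envKey); ok && v != "" {
		value = v
	}
	return value
}

func main() {
	// "info" is the default and "warn" stands in for a value read from config.yaml.
	level := resolve("info", "warn", "TB_KHEDRA_LOGGING_LEVEL", os.LookupEnv)
	fmt.Println("effective logging.level:", level)
}
```

Whatever value wins would then still have to pass the same validation as a value read from the file.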
Configuration Sections
General Settings
- dataFolder: The location where khedra stores all of its data. This directory must exist and be writable.
Chains (Blockchains)
Defines the blockchain networks to interact with. Each chain must have:
- name: Chain name (e.g., mainnet).
- rpcs: List of RPC endpoints. At least one valid and reachable endpoint is required.
- enabled: Whether the chain is active.
Behavior for Empty RPCs
- If the RPCs field is empty in the environment, it is ignored and the configuration file's value is preserved.
- If the RPCs field is empty in the final configuration (after merging), the configuration will be rejected.
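The two rules above can be sketched as a single merge step. This is an illustrative reading of the rules, not khedra's implementation: mergeRPCs is a hypothetical name, and the sketch checks only for emptiness, not whether an endpoint is reachable.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// mergeRPCs combines the endpoints from config.yaml with a comma-separated
// environment override. An override with no usable entries is ignored, and
// an empty result after merging is rejected.
func mergeRPCs(fileRPCs []string, envValue string) ([]string, error) {
	var fromEnv []string
	for _, s := range strings.Split(envValue, ",") {
		if s = strings.TrimSpace(s); s != "" {
			fromEnv = append(fromEnv, s)
		}
	}

	merged := fileRPCs
	if len(fromEnv) > 0 {
		merged = fromEnv // the environment takes precedence when it has entries
	}
	if len(merged) == 0 {
		return nil, errors.New("no RPC endpoints configured")
	}
	return merged, nil
}

func main() {
	rpcs, err := mergeRPCs([]string{"http://file.rpc"}, "http://rpc1.mainnet,http://rpc2.mainnet")
	fmt.Println(rpcs, err)

	rpcs, err = mergeRPCs([]string{"http://file.rpc"}, "")
	fmt.Println(rpcs, err)

	rpcs, err = mergeRPCs(nil, "")
	fmt.Println(rpcs, err)
}
```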
Services (API, Scraper, Monitor, IPFS)
Defines various services provided by Khedra. Supported services:
- API:
  - Requires port to be specified.
- Scraper and Monitor:
  - sleep: Duration (seconds) between operations.
  - batchSize: Number of blocks to process in each operation (50-10,000).
- IPFS:
  - Requires port to be specified.
Logging Configuration
Controls the application's logging behavior:
- folder: Directory for storing logs.
- filename: Name of the log file.
- level: Logging level. Possible values: debug, info, warn, error.
- maxSize: Maximum log file size before rotation.
- maxBackups: Number of old log files to retain.
- maxAge: Retention period for old logs.
- compress: Whether to compress rotated logs.
Validation Rules
The configuration file and environment variables are validated on load with the following rules:
General
- dataFolder: Must be a valid, existing directory and writable.
Chains
- name: Required and non-empty.
- rpcs: Must include at least one valid and reachable RPC URL.
  - Empty RPC Behavior: Ignored from the environment, but required in the final configuration.
- enabled: Defaults to false if not specified.
Services
- name: Required and non-empty. Must be one of api, scraper, monitor, ipfs.
- enabled: Defaults to false if not specified.
- port: For API and IPFS services, must be between 1024 and 65535.
- sleep: Must be non-negative.
- batchSize: Must be between 50 and 10,000.
Logging
- folder: Must exist and be writable.
- filename: Must end with .log.
- level: Must be one of debug, info, warn, error.
- maxSize: Minimum value of 5.
- maxBackups: Minimum value of 1.
- maxAge: Minimum value of 1.
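A few of the rules above translate directly into code. The following Go sketch checks the service and logging rules as they are written here; the Service type and both function names are hypothetical, and checks that touch the file system or network (writable folders, reachable RPCs) are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Service is a hypothetical, simplified view of one entry in the services section.
type Service struct {
	Name      string
	Port      int
	Sleep     int
	BatchSize int
}

// validateService applies the service rules listed above. Only the four
// service names given in those rules are accepted.
func validateService(s Service) error {
	switch s.Name {
	case "api", "ipfs":
		if s.Port < 1024 || s.Port > 65535 {
			return fmt.Errorf("%s: port %d is outside 1024-65535", s.Name, s.Port)
		}
	case "scraper", "monitor":
		if s.Sleep < 0 {
			return fmt.Errorf("%s: sleep must be non-negative", s.Name)
		}
		if s.BatchSize < 50 || s.BatchSize > 10000 {
			return fmt.Errorf("%s: batchSize %d is outside 50-10000", s.Name, s.BatchSize)
		}
	default:
		return fmt.Errorf("unknown service %q", s.Name)
	}
	return nil
}

// validateLogging checks the filename suffix and the log level.
func validateLogging(filename, level string) error {
	if !strings.HasSuffix(filename, ".log") {
		return fmt.Errorf("filename %q must end with .log", filename)
	}
	switch level {
	case "debug", "info", "warn", "error":
		return nil
	}
	return fmt.Errorf("invalid log level %q", level)
}

func main() {
	fmt.Println(validateService(Service{Name: "api", Port: 8080}))
	fmt.Println(validateService(Service{Name: "scraper", Sleep: 12, BatchSize: 20}))
	fmt.Println(validateLogging("khedra.log", "info"))
}
```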
Default Values
If the configuration file is not found or incomplete, Khedra uses the following defaults:
- Data directory: ~/.khedra/data
- Logging configuration:
  - Folder: ~/.khedra/logs
  - Filename: khedra.log
  - Max size: 10 MB
  - Max backups: 3
  - Max age: 10 days
  - Compression: Enabled
  - Log level: info
- Chains: Only mainnet and sepolia enabled by default.
- Services: All services (api, scraper, monitor, ipfs) enabled with default configurations.
Common Commands
- Validate Configuration: Khedra validates the config.yaml file and environment variables automatically on startup.
- Run Khedra:
  ./khedra --version
  Ensure that your config.yaml file is properly set up.
- Override Configuration with Environment Variables:
  Use environment variables to override specific configurations:
  export TB_KHEDRA_GENERAL_DATAFOLDER="/new/path"
  ./khedra
For additional details, see the technical specification.
Understanding Khedra
Key Features
- Blockchain Indexing: Active indexing of EVM-compatible chains.
- REST API: Expose blockchain data and chifra commands via a RESTful interface.
- Address Monitoring: Track specific blockchain addresses for transactions.
- IPFS Integration: Pin indexed data to IPFS for decentralized storage.
Application Interface Overview
Khedra operates through:
- Command-Line Interface (CLI): For configuration and command execution.
- REST API: For programmatic interaction with indexed data.
Terminology and Concepts
- Unchained Index: An index of blockchain data optimized for querying.
- Chains: EVM-compatible blockchains (e.g., Ethereum mainnet, Sepolia).
- Providers: RPC endpoints for interacting with blockchains.
Using Khedra
Indexing Blockchains
To index a blockchain, ensure the required environment variables are set for your RPC endpoints, then run:
./khedra --init all --scrape on
This will initialize the blockchain index and start the scraping process.
Accessing the REST API
Enable the REST API by running the application with:
./khedra --api on
Access the API through the default endpoint:
curl http://localhost:8080
Refer to the API documentation for available endpoints and usage.
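The curl call above has a straightforward Go equivalent for programs that consume the API. The sketch below assumes only what is stated here, namely that the API listens on localhost at the configured port (8080 by default); apiURL and fetch are hypothetical helper names, and no specific endpoint paths are assumed.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// apiURL builds the base URL of a locally running API service.
func apiURL(port int) string {
	return fmt.Sprintf("http://localhost:%d", port)
}

// fetch issues a GET request with a timeout and returns the HTTP status
// code and the response body.
func fetch(url string) (int, string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	status, body, err := fetch(apiURL(8080))
	if err != nil {
		fmt.Println("API not reachable (is the api service enabled?):", err)
		return
	}
	fmt.Println("status:", status)
	fmt.Println(body)
}
```

If the port was changed in config.yaml or through TB_KHEDRA_SERVICES_API_PORT, pass that value to apiURL instead.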
Monitoring Addresses
You can monitor specific blockchain addresses for transactions. Configure the monitored addresses in your .env file or through the API, and enable monitoring:
./khedra --monitor on
Managing Configurations
Khedra configurations can be managed using the .env file. Changes to the .env file require a restart of the application to take effect.
Advanced Operations
Integrating with IPFS
Enable IPFS support with:
./khedra --ipfs on
This will pin indexed blockchain data to IPFS, ensuring decentralized storage and retrieval.
Customizing Chain Indexing
Specify additional chains by updating the TB_NODE_CHAINS environment variable. Example:
TB_NODE_CHAINS="mainnet,sepolia,gnosis"
Ensure each chain has a valid RPC endpoint configured.
Utilizing Command-Line Options
Key options include:
- --init [all|blooms|none]: Specify the type of index initialization.
- --scrape [on|off]: Enable or disable the scraper.
- --api [on|off]: Enable or disable the API.
- --sleep [int]: Set the sleep duration between updates in seconds.
Maintenance and Troubleshooting
Updating Khedra
To update the application, pull the latest changes from the repository and rebuild the binary:
git pull
go build -o khedra .
Common Issues and Solutions
- Missing RPC Provider: Ensure your .env file contains valid RPC URLs.
- Configuration Errors: Use --help to validate command-line arguments.
Log Files and Debugging
Logs are written to the standard output by default. Set the log level in the .env file:
TB_KHEDRA_LOGGING_LEVEL="debug"
Contacting Support
If you encounter issues not covered in this guide, contact support at: TrueBlocks Support
Appendices
Glossary of Terms
- EVM: Ethereum Virtual Machine, the runtime environment for smart contracts in Ethereum and similar blockchains.
- RPC: Remote Procedure Call, a protocol allowing the application to communicate with blockchain nodes.
- Indexing: The process of organizing blockchain data for fast and efficient retrieval.
- IPFS: InterPlanetary File System, a decentralized storage system for sharing and retrieving data.
Frequently Asked Questions (FAQ)
1. What chains are supported by Khedra?
Khedra supports Ethereum mainnet and other EVM-compatible chains such as Sepolia and Gnosis. Additional chains can be added by configuring the TB_NODE_CHAINS environment variable.
2. Do I need an RPC endpoint for every chain?
Yes, each chain you want to index or interact with requires a valid RPC endpoint specified in the .env file.
3. Can I run Khedra without IPFS?
Yes, IPFS integration is optional and can be enabled or disabled using the --ipfs command-line option.
References and Further Reading
- TrueBlocks GitHub Repository
- TrueBlocks Official Website
- Ethereum Developer Documentation
- IPFS Documentation
Index
- Address Monitoring: Chapter 4, Section "Monitoring Addresses"
- Advanced Operations: Chapter 5
- API Access: Chapter 4, Section "Accessing the REST API"
- Blockchain Indexing: Chapter 4, Section "Indexing Blockchains"
- Chains: Chapter 3, Section "Terminology and Concepts"
- Configuration Management: Chapter 4, Section "Managing Configurations"
- Glossary: Chapter 7, Section "Glossary of Terms"
- IPFS Integration: Chapter 5, Section "Integrating with IPFS"
- Logging and Debugging: Chapter 6, Section "Log Files and Debugging"
- RPC Endpoints: Chapter 2, Section "Initial Configuration"
- Troubleshooting: Chapter 6
Technical Specification
Purpose of this Document
This document defines the technical architecture, design, and functionalities of Khedra, enabling developers and engineers to understand its internal workings and design principles. For a less technical overview of the application, refer to the User Manual.
Intended Audience
This specification is for:
- Developers working on Khedra or integrating it into applications.
- System architects designing systems that use Khedra.
- Technical professionals looking for a detailed understanding of the system.
Scope and Objectives
The specification covers:
- High-level architecture.
- Core functionalities such as blockchain indexing, REST API, and address monitoring.
- Design principles, including scalability, error handling, and integration with IPFS.
- Supported chains, RPC requirements, and testing methodologies.
Introduction
System Architecture
High-Level Architecture Diagram
graph TD
    config.go --> service.go
    service.go --> logging.go
    service.go --> chain.go
    chain.go --> validate.go
    validate.go --> general.go
    general.go --> testing.go
    chain_test.go --> chain.go
    validate_test.go --> validate.go
    logging_test.go --> logging.go
    general_test.go --> general.go
    config_test.go --> config.go
Key Components Overview
- Blockchain Indexer: Handles blockchain data collection and indexing.
- REST API Server: Exposes APIs for data access.
- IPFS Integrator: Manages decentralized storage.
- Configuration Manager: Parses .env files and other configurations.
Interactions Between Components
- The Blockchain Indexer collects data from RPC endpoints and stores it in the local database.
- The REST API retrieves indexed data and exposes it via endpoints.
- The IPFS Integrator uploads and pins indexed data to IPFS for decentralized access.
Core Functionalities
Blockchain Indexing
Indexes blockchain data for fast and efficient retrieval. Supports multiple chains and tracks transactions.
REST API
Exposes indexed data through a REST API. Includes endpoints for:
- Retrieving transactions and blocks.
- Accessing monitored address data.
Address Monitoring
Allows tracking of specific blockchain addresses. Captures transactions and updates in real-time.
IPFS Integration
Pins portions of the Unchained Index to IPFS for decentralized and tamper-proof storage.
Technical Design
Configuration Files and Environment Variables
Khedra uses a .env file for configuration. Key variables include:
- TB_NODE_DATAFOLDER: Directory for storing data.
- TB_NODE_MAINNETRPC: RPC endpoint for Ethereum mainnet.
- TB_NODE_CHAINS: List of chains to index.
Initialization Process
1. Validate .env configuration.
2. Connect to RPC endpoints for the specified chains.
3. Initialize the blockchain index if necessary.
Data Flow and Processing
- Input: Blockchain data retrieved via RPC.
- Processing: Indexing, storing, and optionally pinning data to IPFS.
- Output: Indexed data accessible through the REST API.
Error Handling and Logging
Logs are written to the console with adjustable levels (Debug, Info, Warn, Error). Errors during initialization or RPC interactions are logged and reported.
Supported Chains
List of Supported Blockchains
Khedra supports Ethereum mainnet and other EVM-compatible chains like:
- Sepolia
- Gnosis
- Optimism
Requirements for RPC Endpoints
Each chain requires a valid RPC endpoint. For example:
- TB_NODE_MAINNETRPC: Mainnet RPC URL.
- TB_NODE_SEPOLIARPC: Sepolia RPC URL.
Handling Multiple Chains
To enable multiple chains, set TB_NODE_CHAINS in the .env file:
TB_NODE_CHAINS="mainnet,sepolia,gnosis"
Ensure each chain has a corresponding RPC endpoint.
Command-Line Interface
Available Commands and Options
Initialization
./khedra --init all
- Options: all, blooms, none
Scraper
./khedra --scrape on
- Enables or disables the blockchain scraper.
REST API
./khedra --api on
- Starts the API server.
Sleep Duration
./khedra --sleep 60
- Sets the duration (in seconds) between updates.
Detailed Behavior for Each Command
- --init: Controls how the blockchain index is initialized.
- --scrape: Toggles the blockchain scraper.
- --api: Starts or stops the API server.
Performance and Scalability
Performance Benchmarks
Khedra is designed to handle high-throughput blockchain data. Typical performance benchmarks include:
- Processing speed: ~500 blocks per second (depending on RPC response time).
- REST API response time: <50ms for standard queries.
Strategies for Handling Large-Scale Data
- Use high-performance RPC endpoints with low latency.
- Increase local storage capacity to handle large blockchain data.
- Scale horizontally by running multiple instances of Khedra for different chains.
Resource Optimization Guidelines
- Limit the number of chains processed simultaneously to reduce system load.
- Configure --sleep duration to balance processing speed with system resource usage.
Integration Points
Integration with External APIs
Khedra exposes data through a REST API, making it compatible with external applications. Example use cases:
- Fetching transaction details for a given address.
- Retrieving block information for analysis.
Interfacing with IPFS
Data indexed by Khedra can be pinned to IPFS for decentralized storage:
./khedra --ipfs on
Customizing for Specific Use Cases
Users can tailor the configuration by:
- Adjusting .env variables to include specific chains and RPC endpoints.
- Writing custom scripts to query the REST API and process the data.
Testing and Validation
Unit Testing
Unit tests cover:
- Blockchain indexing logic.
- Configuration parsing and validation.
- REST API endpoint functionality.
Run tests with:
go test ./...
Integration Testing
Integration tests ensure all components work together as expected. Tests include:
- RPC connectivity validation.
- Multi-chain indexing workflows.
Testing Guidelines for Developers
- Use mock RPC endpoints for testing without consuming live resources.
- Validate .env configuration in test environments before deployment.
- Automate tests with CI/CD pipelines to ensure reliability.
Appendices
Glossary of Technical Terms
- EVM: Ethereum Virtual Machine, the runtime environment for smart contracts.
- RPC: Remote Procedure Call, a protocol for interacting with blockchain nodes.
- IPFS: InterPlanetary File System, a decentralized storage solution.
References and Resources
- TrueBlocks GitHub Repository
- TrueBlocks Official Website
- Ethereum Developer Documentation
- IPFS Documentation
Index
- Address Monitoring: Section 3, Core Functionalities
- API Access: Section 3, Core Functionalities
- Architecture Overview: Section 2, System Architecture
- Blockchain Indexing: Section 3, Core Functionalities
- Configuration Files: Section 4, Technical Design
- Data Flow: Section 4, Technical Design
- Error Handling: Section 4, Technical Design
- Integration Points: Section 8, Integration Points
- IPFS Integration: Section 3, Core Functionalities; Section 8, Integration Points
- Logging: Section 4, Technical Design
- Performance Benchmarks: Section 7, Performance and Scalability
- REST API: Section 3, Core Functionalities; Section 8, Integration Points
- RPC Requirements: Section 5, Supported Chains and RPCs
- Scalability Strategies: Section 7, Performance and Scalability
- System Components: Section 2, System Architecture
- Testing Guidelines: Section 9, Testing and Validation