Core Functionalities
This section details Khedra's primary technical functionalities, explaining how each core feature is implemented and the technical approaches used.
Blockchain Indexing
The Unchained Index
The Unchained Index is the foundational data structure of Khedra, providing a reverse-lookup capability from addresses to their appearances in blockchain data.
Technical Implementation
The index is implemented as a specialized data structure with these key characteristics:
- Bloom Filter Front-End: A probabilistic data structure that quickly determines whether an address might appear within a given chunk of blocks
- Address-to-Appearance Mapping: Maps each address to a list of its appearances
- Chunked Storage: Divides the index into manageable chunks (typically 1,000,000 blocks per chunk)
- Versioned Format: Includes version metadata to handle format evolution
// Simplified representation of the index structure
type UnchainedIndex struct {
    Version string
    Chunks  map[uint64]*IndexChunk // Key is chunk ID
}

type IndexChunk struct {
    BloomFilter *BloomFilter
    Appearances map[string][]Appearance // Key is hex address
    StartBlock  uint64
    EndBlock    uint64
    LastUpdated time.Time
}

type Appearance struct {
    BlockNumber      uint64
    TransactionIndex uint16
    AppearanceType   uint8
    LogIndex         uint16
}
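To show how these structures work together, the following sketch outlines an address lookup. The BloomFilter.Contains method is assumed here for illustration; the bloom filter prunes chunks that cannot contain the address before the appearance map is consulted.

// Simplified lookup: gather every appearance of an address across chunks.
// BloomFilter.Contains is assumed here for illustration.
func (idx *UnchainedIndex) Lookup(address string) []Appearance {
    var results []Appearance
    for _, chunk := range idx.Chunks {
        // Fast negative check: skip chunks that cannot contain the address.
        if !chunk.BloomFilter.Contains(address) {
            continue
        }
        // Confirm the probabilistic hit against the appearance map.
        if apps, ok := chunk.Appearances[address]; ok {
            results = append(results, apps...)
        }
    }
    return results
}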
Indexing Process
- Block Retrieval: Fetch blocks from the RPC endpoint in configurable batches
- Appearance Extraction: Process each block to extract address appearances from:
- Transaction senders and recipients
- Log topics and indexed parameters
- Trace calls and results
- State changes
- Deduplication: Remove duplicate appearances within the same transaction
- Storage: Update the appropriate index chunk with the new appearances
- Bloom Filter Update: Update the bloom filter for quick future lookups
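The sketch below ties these steps together in a simplified indexing loop. The fetchBlocks and extractAppearances helpers, the BloomFilter.Insert method, and the block type are assumptions made for illustration; chunk creation and error handling are reduced to the essentials.

const batchSize = 100 // illustrative batch size

// Simplified indexing loop; fetchBlocks and extractAppearances are
// hypothetical helpers standing in for the real RPC and parsing code.
func indexRange(idx *UnchainedIndex, start, end uint64) error {
    for from := start; from <= end; from += batchSize {
        to := from + batchSize - 1
        if to > end {
            to = end
        }
        blocks, err := fetchBlocks(from, to) // batched RPC retrieval
        if err != nil {
            return err
        }
        for _, block := range blocks {
            // extractAppearances returns address -> deduplicated appearances
            // gathered from senders, recipients, logs, traces, and state changes.
            chunk := idx.Chunks[block.Number/1_000_000] // one chunk per 1,000,000 blocks
            for addr, apps := range extractAppearances(block) {
                chunk.Appearances[addr] = append(chunk.Appearances[addr], apps...)
                chunk.BloomFilter.Insert(addr) // keep the bloom filter current
            }
        }
    }
    return nil
}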
Performance Optimizations
- Parallel Processing: Multiple blocks processed concurrently
- Bloom Filters: Fast negative lookups to avoid unnecessary disk access
- Binary Encoding: Compact storage format for index data
- Caching: Frequently accessed index portions kept in memory
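As a concrete illustration of the parallel-processing point, the following sketch fans block numbers out to a bounded pool of goroutines using the standard sync package; processBlock is a stand-in for the per-block extraction work.

// Sketch of concurrent block processing with a bounded worker pool.
func processConcurrently(blockNums []uint64, workers int) {
    var wg sync.WaitGroup
    jobs := make(chan uint64)
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for bn := range jobs {
                processBlock(bn) // hypothetical per-block extraction
            }
        }()
    }
    for _, bn := range blockNums {
        jobs <- bn
    }
    close(jobs)
    wg.Wait()
}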
Address Monitoring
Monitor Implementation
The monitoring system tracks specific addresses for on-chain activity and provides notifications when activity is detected.
Technical Implementation
Monitors are implemented using these components:
- Monitor Registry: Central store of all monitored addresses
- Address Index: Fast lookup structure for monitored addresses
- Activity Tracker: Records and timestamps address activity
- Notification Manager: Handles alert distribution based on configuration
// Simplified monitor implementation
type Monitor struct {
    Address      string
    Description  string
    CreatedAt    time.Time
    LastActivity time.Time
    Config       MonitorConfig
    ActivityLog  []Activity
}

type MonitorConfig struct {
    NotificationChannels []string
    Filters              *ActivityFilter
    Thresholds           map[string]interface{}
}

type Activity struct {
    BlockNumber     uint64
    TransactionHash string
    Timestamp       time.Time
    ActivityType    string
    Details         map[string]interface{}
}
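The monitor registry itself is not shown above; a minimal sketch, assuming the standard sync and strings packages, might look like this. It doubles as the fast lookup structure by keying monitors on their lowercase hex address.

// Hypothetical registry sketch: the central store doubles as a fast
// lookup structure keyed by lowercase hex address.
type MonitorRegistry struct {
    mu       sync.RWMutex
    monitors map[string]*Monitor
}

func (r *MonitorRegistry) Add(m *Monitor) {
    r.mu.Lock()
    defer r.mu.Unlock()
    r.monitors[strings.ToLower(m.Address)] = m
}

func (r *MonitorRegistry) IsMonitored(address string) (*Monitor, bool) {
    r.mu.RLock()
    defer r.mu.RUnlock()
    m, ok := r.monitors[strings.ToLower(address)]
    return m, ok
}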
Monitoring Process
- Registration: Add addresses to the monitor registry
- Block Processing: As new blocks are processed, check for monitored addresses
- Activity Detection: When a monitored address appears, record the activity
- Notification: Based on configuration, send notifications via configured channels
- State Update: Update the monitor's state with the new activity
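A simplified detection step, building on the registry sketch above, might look like the following; notify is a hypothetical dispatch helper for the configured channels.

// Sketch of the detection step, called for each address appearance as
// blocks are processed; notify is a hypothetical dispatch helper.
func checkAppearance(reg *MonitorRegistry, addr string, act Activity) {
    monitor, ok := reg.IsMonitored(addr)
    if !ok {
        return // not a monitored address
    }
    monitor.ActivityLog = append(monitor.ActivityLog, act) // record the activity
    monitor.LastActivity = act.Timestamp                   // update monitor state
    for _, channel := range monitor.Config.NotificationChannels {
        notify(channel, monitor.Address, act) // alert via each configured channel
    }
}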
Optimization Approaches
- Focused Index: Maintain a separate index for just monitored addresses
- Early Detection: Check monitored addresses early in the processing pipeline
- Configurable Sensitivity: Allow users to set thresholds for notifications
- Batched Notifications: Group notifications to prevent excessive alerts
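The batching idea can be sketched as a small buffer flushed on a timer; the send callback here is a stand-in for whatever channel-specific delivery Khedra performs.

// Sketch of batched delivery: activities are buffered and flushed on an
// interval so a burst of appearances produces one grouped alert.
func batchNotifier(in <-chan Activity, every time.Duration, send func([]Activity)) {
    ticker := time.NewTicker(every)
    defer ticker.Stop()
    var pending []Activity
    for {
        select {
        case act, ok := <-in:
            if !ok {
                if len(pending) > 0 {
                    send(pending) // flush remaining activities on shutdown
                }
                return
            }
            pending = append(pending, act)
        case <-ticker.C:
            if len(pending) > 0 {
                send(pending)
                pending = nil
            }
        }
    }
}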
API Service
RESTful Interface
The API service provides HTTP endpoints for querying indexed data and managing Khedra's operations.
Technical Implementation
The API is implemented using these components:
- HTTP Server: Handles incoming requests and routing
- Route Handlers: Process specific endpoint requests
- Authentication Middleware: Optional API key verification
- Response Formatter: Structures data in requested format (JSON, CSV, etc.)
- Documentation: Auto-generated Swagger documentation
// Simplified API route implementation
type APIRoute struct {
    Path        string
    Method      string
    Handler     http.HandlerFunc
    Description string
    Params      []Parameter
    Responses   map[int]Response
}

// API server initialization
func NewAPIServer(config Config) *APIServer {
    server := &APIServer{
        router: mux.NewRouter(),
        port:   config.Port,
        auth:   config.Auth,
    }
    server.registerRoutes()
    return server
}
API Endpoints
The API provides endpoints in several categories:
- Status Endpoints: System and service status information
- Index Endpoints: Query the Unchained Index for address appearances
- Monitor Endpoints: Manage and query address monitors
- Chain Endpoints: Blockchain information and operations
- Admin Endpoints: Configuration and management operations
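A registration sketch for these categories might look like the following. The paths and handler names are illustrative only and do not reflect Khedra's actual endpoint layout.

// Illustrative route registration grouped by category; paths and handler
// names are examples, not the actual API surface.
func (s *APIServer) registerRoutes() {
    // Status endpoints
    s.router.HandleFunc("/status", s.handleStatus).Methods("GET")
    // Index endpoints
    s.router.HandleFunc("/appearances/{address}", s.handleAppearances).Methods("GET")
    // Monitor endpoints
    s.router.HandleFunc("/monitors", s.handleListMonitors).Methods("GET")
    s.router.HandleFunc("/monitors", s.handleAddMonitor).Methods("POST")
    // Chain endpoints
    s.router.HandleFunc("/chains/{chain}/status", s.handleChainStatus).Methods("GET")
    // Admin endpoints
    s.router.HandleFunc("/admin/config", s.handleGetConfig).Methods("GET")
}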
Performance Considerations
- Connection Pooling: Reuse connections for efficiency
- Response Caching: Cache frequent queries with appropriate invalidation
- Pagination: Limit response sizes for large result sets
- Query Optimization: Efficient translation of API queries to index lookups
- Rate Limiting: Prevent resource exhaustion from excessive requests
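As an example of the rate-limiting point, a minimal global limiter can be written as middleware around the router using golang.org/x/time/rate; a production setup would likely track limits per API key or client IP.

// Sketch of a global rate-limiting middleware; perSecond and burst would
// come from the API service configuration.
func rateLimit(next http.Handler, perSecond float64, burst int) http.Handler {
    limiter := rate.NewLimiter(rate.Limit(perSecond), burst)
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if !limiter.Allow() {
            http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}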
IPFS Integration
Distributed Index Sharing
The IPFS integration enables sharing and retrieving index chunks through the distributed IPFS network.
Technical Implementation
The IPFS functionality is implemented with these components:
- IPFS Node: Either embedded or external IPFS node connection
- Chunk Manager: Handles breaking the index into shareable chunks
- Publishing Logic: Manages uploading chunks to IPFS
- Discovery Service: Finds and retrieves chunks from the network
- Validation: Verifies the integrity of downloaded chunks
// Simplified IPFS service implementation
type IPFSService struct {
    node         *ipfs.CoreAPI
    chunkManager *ChunkManager
    config       IPFSConfig
}

type ChunkManager struct {
    chunkSize      uint64
    validationFunc func([]byte) bool
    storage        *Storage
}
Distribution Process
- Chunking: Divide the index into manageable chunks with metadata
- Publishing: Add chunks to IPFS and record their content identifiers (CIDs)
- Announcement: Share availability information through the network
- Discovery: Find chunks needed by querying the IPFS network
- Retrieval: Download needed chunks from peers
- Validation: Verify chunk integrity before integration
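The following sketch shows the publish and retrieve steps against a deliberately minimal, hypothetical ipfsClient interface (a real implementation would wrap the embedded node or an external node's API); it uses the validationFunc from the ChunkManager shown earlier and the standard fmt package.

// Hypothetical minimal client interface for the underlying IPFS node.
type ipfsClient interface {
    Add(data []byte) (cid string, err error) // returns the content identifier (CID)
    Get(cid string) ([]byte, error)
}

// Publishing: add a chunk to IPFS and return its CID for announcement.
func (s *IPFSService) publishChunk(client ipfsClient, chunk []byte) (string, error) {
    return client.Add(chunk)
}

// Retrieval: download a chunk by CID and verify it before integration.
func (s *IPFSService) fetchChunk(client ipfsClient, cid string) ([]byte, error) {
    data, err := client.Get(cid)
    if err != nil {
        return nil, err
    }
    if !s.chunkManager.validationFunc(data) {
        return nil, fmt.Errorf("chunk %s failed validation", cid)
    }
    return data, nil
}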
Optimization Strategies
- Incremental Updates: Share only changed or new chunks
- Prioritized Retrieval: Download most useful chunks first
- Peer Selection: Connect to reliable peers for better performance
- Background Syncing: Retrieve chunks in the background without blocking
- Compressed Storage: Minimize bandwidth and storage requirements
Configuration Management
Flexible Configuration System
Khedra's configuration system provides multiple ways to configure the application, with clear precedence rules.
Technical Implementation
The configuration system is implemented with these components:
- YAML Parser: Reads the configuration file format
- Environment Variable Processor: Overrides from environment variables
- Validation Engine: Ensures configuration values are valid
- Defaults Manager: Provides sensible defaults where needed
- Runtime Updater: Handles configuration changes during operation
// Simplified configuration structure
type Config struct {
    General  GeneralConfig
    Chains   map[string]ChainConfig
    Services map[string]ServiceConfig
    Logging  LoggingConfig
}

// Configuration loading process
func LoadConfig(path string) (*Config, error) {
    config := DefaultConfig()

    // Load from file if it exists
    if fileExists(path) {
        if err := loadFromFile(path, config); err != nil {
            return nil, err
        }
    }

    // Override with environment variables
    applyEnvironmentOverrides(config)

    // Validate the final configuration
    if err := validateConfig(config); err != nil {
        return nil, err
    }

    return config, nil
}
Configuration Sources
The system processes configuration from these sources, in order of precedence:
- Environment Variables: Highest precedence, override all other sources
- Configuration File: User-provided settings in YAML format
- Default Values: Built-in defaults for unspecified settings
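A sketch of the override step follows. The environment variable names and the configuration fields they map to are assumptions made for illustration, not a definitive list.

// Sketch of applying environment overrides; variable and field names are
// illustrative assumptions, not Khedra's actual configuration schema.
func applyEnvironmentOverrides(config *Config) {
    if v := os.Getenv("TB_KHEDRA_LOGGING_LEVEL"); v != "" {
        config.Logging.Level = v // environment wins over file and defaults
    }
    if v := os.Getenv("TB_KHEDRA_GENERAL_DATAFOLDER"); v != "" {
        config.General.DataFolder = v
    }
}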
Validation Rules
The configuration system enforces these kinds of validation:
- Type Validation: Ensures values have the correct data type
- Range Validation: Numeric values within acceptable ranges
- Format Validation: Strings matching required patterns (e.g., URLs)
- Dependency Validation: Related settings are consistent
- Resource Validation: Settings are compatible with available resources
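A validation sketch, assuming illustrative RPCEndpoint and Port fields on the chain and service configurations, might combine several of these rules:

// Sketch of configuration validation; the field names and rules are illustrative.
func validateConfig(config *Config) error {
    for name, chain := range config.Chains {
        // Format validation: RPC endpoints must be well-formed URLs.
        if _, err := url.ParseRequestURI(chain.RPCEndpoint); err != nil {
            return fmt.Errorf("chain %s: invalid RPC endpoint: %w", name, err)
        }
    }
    for name, svc := range config.Services {
        // Range validation: ports must fall within the usable range.
        if svc.Port < 1024 || svc.Port > 65535 {
            return fmt.Errorf("service %s: port %d out of range", name, svc.Port)
        }
    }
    return nil
}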
These core functionalities form the technical foundation of Khedra, enabling its primary capabilities while providing the flexibility and performance required for blockchain data processing.