Comprehensive troubleshooting guide for the Cellframe SDK with diagnostic procedures, common problem resolution, performance optimization, and debugging techniques for blockchain application development.

## Overview

This troubleshooting guide employs a systematic diagnostic approach to resolve common and complex issues encountered during Cellframe SDK development and deployment. Each section provides diagnostic procedures, root cause analysis, and comprehensive solutions with verification steps.

**Troubleshooting Methodology:**
- Systematic issue identification and categorization
- Step-by-step diagnostic procedures with log analysis
- Root cause analysis with comprehensive verification
- Preventive measures and best practices
- Performance optimization and monitoring techniques
- Advanced debugging tools and techniques integration

**Issue Categories:**
- Build and installation problems with environment setup
- Runtime errors and exception handling patterns
- Network connectivity and protocol issues
- Performance bottlenecks and resource optimization
- Memory management and leak detection
- Cryptographic operations and key management issues

## Table of Contents

- [[#Build and Installation Issues|Build and Installation Issues]]
- [[#Runtime Errors|Runtime Errors]]
- [[#Network and Connectivity Issues|Network and Connectivity Issues]]
- [[#Performance Problems|Performance Problems]]
- [[#Memory Issues|Memory Issues]]
- [[#Cryptographic Issues|Cryptographic Issues]]
- [[#Configuration Problems|Configuration Problems]]
- [[#Debugging Techniques|Debugging Techniques]]
- [[#Logging and Monitoring|Logging and Monitoring]]
- [[#Getting Help|Getting Help]]

## Build and Installation Issues

### CMake Configuration Failures

**Problem**: CMake cannot find Cellframe SDK
```
CMake Error: Could not find a package configuration file provided by "cellframe-sdk"
```

**Solution**:
```bash
# Check if pkg-config can find the SDK
pkg-config --exists cellframe-sdk
echo $?
# Should return 0

# If not found, check installation
pkg-config --list-all | grep cellframe

# Set PKG_CONFIG_PATH if needed
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH

# For custom installation paths
export PKG_CONFIG_PATH=/opt/cellframe-sdk/lib/pkgconfig:$PKG_CONFIG_PATH
```

**Alternative CMake approach**:
```cmake
# In CMakeLists.txt
find_path(CELLFRAME_INCLUDE_DIR
    NAMES dap_common.h
    PATHS /usr/local/include /opt/cellframe-sdk/include
    PATH_SUFFIXES cellframe-sdk dap
)

find_library(CELLFRAME_LIBRARY
    NAMES dap_core
    PATHS /usr/local/lib /opt/cellframe-sdk/lib
)

if(CELLFRAME_INCLUDE_DIR AND CELLFRAME_LIBRARY)
    set(CELLFRAME_FOUND TRUE)
endif()
```

### Compiler Errors

**Problem**: C11 standard not supported
```
error: 'for' loop initial declarations are only allowed in C99 or C11 mode
```

**Solution**:
```cmake
# In CMakeLists.txt
set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED ON)

# Or add compiler flags
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c11")
```

**Problem**: Missing headers
```
fatal error: dap_common.h: No such file or directory
```

**Solution**:
```bash
# Check if headers are installed
find /usr -name "dap_common.h" 2>/dev/null

# Install development packages
sudo apt-get install cellframe-sdk-dev   # Debian/Ubuntu
sudo dnf install cellframe-sdk-devel     # Fedora/RHEL

# Or build from source
git clone https://gitlab.demlabs.net/cellframe/cellframe-sdk.git
cd cellframe-sdk
mkdir build && cd build
cmake ..
make -j$(nproc)
sudo make install
```

### Linking Errors

**Problem**: Undefined references during linking
```
undefined reference to `dap_common_init`
```

**Solution**:
```cmake
# Check library linking order
target_link_libraries(my_app
    dap_core
    dap_crypto
    dap_io
    pthread
    m
)

# For static linking
target_link_libraries(my_app
    -Wl,--start-group
    ${CELLFRAME_LIBRARIES}
    -Wl,--end-group
)
```

**Problem**: Library not found at runtime
```
error while loading shared libraries: libdap_core.so: cannot open shared object file
```

**Solution**:
```bash
# Check library path
ldd my_app | grep dap

# Add to library path
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

# Or configure system-wide
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/cellframe.conf
sudo ldconfig
```

## Runtime Errors

### Initialization Failures

**Problem**: DAP Core initialization fails
```c
int ret = dap_common_init("MyApp", "app.log");
if (ret != 0) {
    // ret = -1
}
```

**Diagnosis**:
```c
// Enable debug logging before init by setting this in the shell:
//   export DAP_DEBUG=1

// Check system resources
#include <stdio.h>
#include <sys/resource.h>

struct rlimit rl;
if (getrlimit(RLIMIT_NOFILE, &rl) == 0) {
    // rlim_t has no portable printf specifier, so cast explicitly
    printf("File descriptor limit: %lu\n", (unsigned long)rl.rlim_cur);
}
```

**Solutions**:

1. **Insufficient permissions**:
   ```bash
   # Ensure log directory exists and is writable by the application user
   sudo mkdir -p /var/log/myapp
   sudo chown myapp /var/log/myapp
   sudo chmod 755 /var/log/myapp

   # Run with appropriate user
   sudo -u myapp ./my_app
   ```

2. **Resource limits**:
   ```bash
   # Increase file descriptor limit for the current shell
   ulimit -n 8192

   # Or add these lines to /etc/security/limits.conf:
   #   myapp soft nofile 8192
   #   myapp hard nofile 16384
   ```

### Segmentation Faults

**Problem**: Segfault during execution

**Debugging steps**:
```bash
# Generate core dump
ulimit -c unlimited
export MALLOC_CHECK_=2

# Run with debugger
gdb ./my_app
(gdb) run
(gdb) bt  # Show backtrace when crashed

# Use Valgrind for memory debugging
valgrind --tool=memcheck --leak-check=full ./my_app
```

**Common causes and solutions**:

1.
   **Null pointer dereference**:
   ```c
   // Problem
   dap_chain_datum_tx_t *tx = get_transaction();
   size_t size = tx->header.tx_items_size; // tx might be NULL

   // Solution
   dap_chain_datum_tx_t *tx = get_transaction();
   if (tx) {
       size_t size = tx->header.tx_items_size;
   }
   ```

2. **Use after free**:
   ```c
   // Problem
   char *data = DAP_NEW_Z_SIZE(char, 1024);
   DAP_DELETE(data);
   strcpy(data, "hello"); // Use after free

   // Solution
   char *data = DAP_NEW_Z_SIZE(char, 1024);
   strcpy(data, "hello");
   DAP_DELETE(data);
   data = NULL; // Prevent reuse
   ```

### Memory Corruption

**Problem**: Random crashes, corrupted data

**Detection**:
```bash
# Compile with debug flags
cmake -DCMAKE_BUILD_TYPE=Debug ..

# Enable memory debugging
export MALLOC_CHECK_=2
export MALLOC_PERTURB_=42

# Use AddressSanitizer
cmake -DCMAKE_C_FLAGS="-fsanitize=address" ..
```

**Common causes**:

1. **Buffer overflow**:
   ```c
   // Problem
   char buffer[64];
   strcpy(buffer, very_long_string); // Overflow

   // Solution
   char buffer[64];
   dap_strncpy(buffer, very_long_string, sizeof(buffer) - 1);
   buffer[sizeof(buffer) - 1] = '\0';
   ```

2. **Stack overflow**:
   ```c
   // Problem - large local arrays
   void function() {
       char huge_buffer[1024*1024]; // 1MB on stack
   }

   // Solution - use heap allocation
   void function() {
       char *buffer = DAP_NEW_Z_SIZE(char, 1024*1024);
       if (buffer) {
           // Use buffer
           DAP_DELETE(buffer);
       }
   }
   ```

## Network and Connectivity Issues

### Connection Failures

**Problem**: Cannot connect to remote nodes
```
Failed to connect to node: Connection refused
```

**Diagnosis**:
```bash
# Test basic connectivity
ping remote_node_ip
telnet remote_node_ip 8089

# Check local firewall
sudo iptables -L | grep 8089
sudo ufw status

# Check if service is listening
netstat -tlnp | grep 8089
ss -tlnp | grep 8089
```

**Solutions**:

1. **Firewall configuration**:
   ```bash
   # Open required ports
   sudo ufw allow 8089/tcp
   sudo iptables -A INPUT -p tcp --dport 8089 -j ACCEPT
   ```

2.
   **Network configuration**:
   ```ini
   # In configuration file
   [network]
   listen_address=0.0.0.0
   listen_port=8089
   external_address=auto
   max_connections=100
   ```

### SSL/TLS Issues

**Problem**: SSL handshake failures
```
SSL_connect failed: certificate verify failed
```

**Solutions**:
```c
// Disable certificate verification for testing (NOT for production)
dap_client_set_ssl_verify(client, false);

// For production, ensure proper certificates
dap_cert_folder_add("/etc/ssl/certs");
```

### Performance Issues

**Problem**: High latency, low throughput

**Diagnosis**:
```c
#include <inttypes.h>

// Enable network statistics
dap_stream_set_debug(true);

// Monitor connection status
typedef struct connection_stats {
    uint64_t bytes_sent;
    uint64_t bytes_received;
    uint32_t packets_lost;
    double latency_ms;
} connection_stats_t;

void monitor_connection(dap_stream_t *stream) {
    connection_stats_t stats = dap_stream_get_stats(stream);
    // PRIu64 is the portable printf specifier for uint64_t
    log_it_info("Sent: %" PRIu64 ", Received: %" PRIu64 ", Latency: %.2f ms",
                stats.bytes_sent, stats.bytes_received, stats.latency_ms);
}
```

**Solutions**:

1.
   **Optimize buffer sizes**:
   ```ini
   [network]
   send_buffer_size=65536
   recv_buffer_size=65536
   keepalive_timeout=60
   ```

2. **Use compression**:
   ```c
   dap_stream_set_compression(stream, DAP_STREAM_COMPRESSION_GZIP);
   ```

## Performance Problems

### High CPU Usage

**Problem**: Application consuming too much CPU

**Diagnosis**:
```bash
# Monitor CPU usage
top -p $(pidof my_app)
htop

# Profile with perf
perf record -g ./my_app
perf report

# Use CPU profiler
valgrind --tool=callgrind ./my_app
kcachegrind callgrind.out.*
```

**Solutions**:

1. **Optimize hot paths**:
   ```c
   #include <time.h>

   // Use profiling macros
   #define PROFILE_START(name) \
       struct timespec start_##name; \
       clock_gettime(CLOCK_MONOTONIC, &start_##name)

   // elapsed_##name keeps the variable unique when several
   // profiled regions share one scope
   #define PROFILE_END(name) \
       struct timespec end_##name; \
       clock_gettime(CLOCK_MONOTONIC, &end_##name); \
       double elapsed_##name = (end_##name.tv_sec - start_##name.tv_sec) * 1000.0 + \
           (end_##name.tv_nsec - start_##name.tv_nsec) / 1000000.0; \
       log_it_debug("Profile %s: %.3f ms", #name, elapsed_##name)

   void process_data() {
       PROFILE_START(process_data);

       // ... processing code ...

       PROFILE_END(process_data);
   }
   ```

2. **Reduce algorithm complexity**:
   ```c
   // Problem: O(n²) search
   for (int i = 0; i < n; i++) {
       for (int j = 0; j < n; j++) {
           if (data[j].id == target_id) {
               // Found
           }
       }
   }

   // Solution: O(1) hash table lookup
   dap_hash_table_t *lookup = dap_hash_table_new(1000);
   // ... populate table ...
   data_t *found = dap_hash_table_lookup(lookup, &target_id);
   ```

### High Memory Usage

**Problem**: Excessive memory consumption

**Diagnosis**:
```bash
# Monitor memory usage
ps aux | grep my_app
pmap $(pidof my_app)

# Memory profiler
valgrind --tool=massif ./my_app
ms_print massif.out.*
```

**Solutions**:

1. **Use memory pools**:
   ```c
   // For frequent allocations
   dap_mempool_t *pool = dap_mempool_create(sizeof(my_struct_t), 1000);

   my_struct_t *obj = dap_mempool_alloc(pool);
   // Use object
   dap_mempool_free(pool, obj);
   ```

2.
   **Implement object recycling**:
   ```c
   typedef struct object_pool {
       my_object_t **free_objects;
       size_t free_count;
       size_t capacity;
       pthread_mutex_t mutex;
   } object_pool_t;

   my_object_t *object_pool_get(object_pool_t *pool) {
       pthread_mutex_lock(&pool->mutex);

       if (pool->free_count > 0) {
           my_object_t *obj = pool->free_objects[--pool->free_count];
           pthread_mutex_unlock(&pool->mutex);
           return obj;
       }

       pthread_mutex_unlock(&pool->mutex);

       // Create new object if pool is empty
       return create_new_object();
   }
   ```

## Memory Issues

### Memory Leaks

**Problem**: Memory usage grows over time

**Detection**:
```bash
# Use Valgrind
valgrind --leak-check=full --show-leak-kinds=all ./my_app

# Use AddressSanitizer
export ASAN_OPTIONS=detect_leaks=1
./my_app
```

**Common patterns**:

1. **Missing cleanup**:
   ```c
   // Problem
   char *data = DAP_NEW_Z_SIZE(char, 1024);
   if (error_condition) {
       return -1; // Leak: data not freed
   }
   DAP_DELETE(data);

   // Solution
   char *data = DAP_NEW_Z_SIZE(char, 1024);
   if (error_condition) {
       DAP_DELETE(data);
       return -1;
   }
   DAP_DELETE(data);
   ```

2.
   **Reference cycles**:
   ```c
   // Problem: circular references
   typedef struct node {
       struct node *parent;
       struct node *child;
   } node_t;

   // Solution: use weak references or explicit cleanup
   void node_destroy(node_t *node) {
       if (!node) {
           return; // Nothing to destroy
       }
       if (node->child) {
           node->child->parent = NULL; // Break cycle
           node_destroy(node->child);
       }
       DAP_DELETE(node);
   }
   ```

### Double Free Errors

**Problem**: Attempting to free already freed memory
```
double free or corruption detected
```

**Prevention**:
```c
// Set pointers to NULL after freeing
#define SAFE_DELETE(ptr) do { \
    if (ptr) { \
        DAP_DELETE(ptr); \
        ptr = NULL; \
    } \
} while(0)

// Usage
char *data = DAP_NEW_Z_SIZE(char, 1024);
SAFE_DELETE(data);
SAFE_DELETE(data); // Safe - no double free
```

## Cryptographic Issues

### Key Generation Failures

**Problem**: Cannot generate cryptographic keys
```c
dap_enc_key_t *key = dap_enc_key_new(DAP_ENC_KEY_TYPE_SIG_DILITHIUM);
// key is NULL
```

**Diagnosis**:
```c
// Check entropy availability
FILE *random = fopen("/dev/random", "r");
if (!random) {
    log_it_error("No entropy source available");
} else {
    fclose(random); // Only probing for existence, so close it again
}

// Check algorithm support
bool supported = dap_enc_key_type_supported(DAP_ENC_KEY_TYPE_SIG_DILITHIUM);
if (!supported) {
    log_it_error("Dilithium not supported in this build");
}
```

**Solutions**:

1. **Ensure entropy**:
   ```bash
   # Install entropy daemon
   sudo apt-get install haveged
   sudo systemctl enable haveged
   sudo systemctl start haveged

   # Check entropy level
   cat /proc/sys/kernel/random/entropy_avail
   ```

2.
   **Fallback to supported algorithms**:
   ```c
   // Key types are dap_enc_key_type_t values, matching what dap_enc_key_new() takes
   const dap_enc_key_type_t fallback_types[] = {
       DAP_ENC_KEY_TYPE_SIG_DILITHIUM,
       DAP_ENC_KEY_TYPE_SIG_FALCON,
       DAP_ENC_KEY_TYPE_SIG_ECDSA
   };

   dap_enc_key_t *create_key_with_fallback(void) {
       for (size_t i = 0; i < sizeof(fallback_types)/sizeof(fallback_types[0]); i++) {
           if (dap_enc_key_type_supported(fallback_types[i])) {
               dap_enc_key_t *key = dap_enc_key_new(fallback_types[i]);
               if (key) {
                   return key;
               }
           }
       }
       return NULL;
   }
   ```

### Signature Verification Failures

**Problem**: Valid signatures failing verification

**Common causes**:

1. **Data modification**:
   ```c
   // Ensure data integrity
   const char *original_data = "Hello World";
   dap_chain_hash_fast_t hash;
   dap_hash_fast(original_data, strlen(original_data), &hash);

   // Later, verify data hasn't changed
   dap_chain_hash_fast_t verify_hash;
   dap_hash_fast(received_data, strlen(received_data), &verify_hash);

   if (memcmp(&hash, &verify_hash, sizeof(hash)) != 0) {
       log_it_error("Data has been modified");
   }
   ```

2. **Wrong key usage**:
   ```c
   // Ensure using correct key pair
   dap_enc_key_t *private_key = dap_enc_key_new(DAP_ENC_KEY_TYPE_SIG_DILITHIUM);
   dap_enc_key_t *public_key = dap_enc_key_get_public_key(private_key);

   // Sign with private key
   dap_sign_t *signature = dap_sign_create(private_key, data, data_len, 0);

   // Verify with public key (not private key)
   int result = dap_sign_verify(signature, public_key, data, data_len);
   ```

## Configuration Problems

### Invalid Configuration Syntax

**Problem**: Configuration not loading
```
Failed to parse configuration file
```

**Diagnosis**:
```c
// Add detailed error reporting
int ret = dap_config_init("./config");
if (ret != 0) {
    const char *error_msg = dap_config_get_last_error();
    log_it_error("Config error: %s", error_msg);
}
```

**Solutions**:

1.
   **Validate configuration syntax**:
   ```bash
   # Check for common issues
   grep -n '^\s*\[' config/app.cfg              # Section headers
   # Drop comment lines first: after -n adds line numbers, '^#' would never match
   grep -vn '^\s*#' config/app.cfg | grep '='   # Key-value pairs

   # Validate with config parser
   cellframe-node config validate config/app.cfg
   ```

2. **Use configuration schema**:
   ```c
   typedef struct config_schema {
       const char *section;
       const char *key;
       const char *type;
       bool required;
       const char *default_value;
   } config_schema_t;

   config_schema_t app_schema[] = {
       {"general", "debug_mode", "bool", false, "false"},
       {"network", "port", "uint16", true, NULL},
       {"app", "name", "string", true, NULL},
       {NULL, NULL, NULL, false, NULL}
   };

   int validate_config(const char *config_path) {
       for (config_schema_t *schema = app_schema; schema->section; schema++) {
           if (schema->required) {
               const char *value = dap_config_get_item_str(schema->section, schema->key);
               if (!value) {
                   log_it_error("Required config missing: [%s]%s",
                                schema->section, schema->key);
                   return -1;
               }
           }
       }
       return 0;
   }
   ```

### Environment Variable Issues

**Problem**: Environment variables not overriding config

**Solution**:
```c
// Implement environment variable override
const char *get_config_value(const char *section, const char *key, const char *default_val) {
    // Try environment variable first
    // str_to_upper() is an application-provided helper (not shown here)
    // that returns an upper-case copy of its argument
    char env_var[256];
    snprintf(env_var, sizeof(env_var), "CELLFRAME_%s_%s",
             str_to_upper(section), str_to_upper(key));

    const char *env_value = getenv(env_var);
    if (env_value) {
        return env_value;
    }

    // Fall back to config file
    const char *config_value = dap_config_get_item_str(section, key);
    return config_value ?
        config_value : default_val;
}
```

## Debugging Techniques

### Debug Logging

**Enable detailed logging**:
```c
// Set debug log level
dap_log_level_set(L_DEBUG);

// Enable specific module debugging
#define LOG_TAG "MYMODULE"
log_it_debug("Debug message: %s", debug_info);

// Conditional debugging
#ifdef DEBUG_CRYPTO
log_it_debug("Crypto operation: type=%d, size=%zu", type, size);
#endif
```

**Environment-based debugging**:
```bash
# Enable debug mode
export DAP_DEBUG=1
export DAP_LOG_LEVEL=DEBUG

# Module-specific debugging
export DAP_DEBUG_CRYPTO=1
export DAP_DEBUG_NETWORK=1
```

### Core Dumps

**Configure core dumps**:
```bash
# Enable core dumps
ulimit -c unlimited

# Set core dump pattern
echo '/tmp/core.%e.%p.%t' | sudo tee /proc/sys/kernel/core_pattern

# Analyze core dump
gdb ./my_app /tmp/core.my_app.12345.1640995200
(gdb) bt
(gdb) info registers
(gdb) print variable_name
```

### Live Debugging

**Attach to running process**:
```bash
# Find process ID
ps aux | grep my_app

# Attach debugger
gdb -p 12345
(gdb) continue
# Trigger issue, then Ctrl+C
(gdb) bt
(gdb) info threads
```

**Remote debugging**:
```bash
# On target machine
gdbserver :9999 ./my_app

# On development machine
gdb
(gdb) target remote target_ip:9999
(gdb) continue
```

## Logging and Monitoring

### Application Metrics

**Implement metrics collection**:
```c
#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

// C11 <stdatomic.h> has no atomic_uint64_t typedef; _Atomic uint64_t is portable
typedef struct app_metrics {
    _Atomic uint64_t requests_total;
    _Atomic uint64_t requests_failed;
    _Atomic uint64_t bytes_processed;
    time_t start_time;
} app_metrics_t;

static app_metrics_t g_metrics = {0};

void metrics_init(void) {
    g_metrics.start_time = time(NULL);
}

void metrics_increment_requests(void) {
    atomic_fetch_add(&g_metrics.requests_total, 1);
}

void metrics_report(void) {
    uint64_t requests = atomic_load(&g_metrics.requests_total);
    uint64_t failed = atomic_load(&g_metrics.requests_failed);
    uint64_t bytes = atomic_load(&g_metrics.bytes_processed);
    time_t uptime = time(NULL) - g_metrics.start_time;

    log_it_info("Metrics: requests=%" PRIu64 ", failed=%" PRIu64 ", bytes=%" PRIu64 ", uptime=%lds",
                requests, failed, bytes, (long)uptime);
}
```

### Health Checks

**Implement health monitoring**:
```c
#include <sys/resource.h>

typedef struct health_status {
    bool database_ok;
    bool network_ok;
    bool memory_ok;
    uint64_t last_activity;
} health_status_t;

health_status_t check_application_health(void) {
    health_status_t health = {0};

    // Check database
    health.database_ok = test_database_connection();

    // Check network
    health.network_ok = test_network_connectivity();

    // Check memory usage (ru_maxrss is reported in kilobytes on Linux;
    // MAX_MEMORY_MB is an application-defined limit)
    struct rusage usage;
    getrusage(RUSAGE_SELF, &usage);
    health.memory_ok = (usage.ru_maxrss < MAX_MEMORY_MB * 1024);

    health.last_activity = (uint64_t)time(NULL);

    return health;
}
```

### Log Analysis

**Structured logging**:
```c
// JSON-formatted logs for easy parsing.
// data_fmt must be a string literal so it can be concatenated into the
// format string; its arguments follow in __VA_ARGS__ (at least one is required).
#define LOG_JSON(level, event, data_fmt, ...) \
    log_it(level, "{\"timestamp\":%lu,\"event\":\"%s\",\"data\":{" data_fmt "}}", \
           (unsigned long)time(NULL), event, __VA_ARGS__)

// Usage
LOG_JSON(L_INFO, "transaction_processed",
         "\"tx_id\":\"%s\",\"amount\":%lu,\"fee\":%lu",
         tx_id, amount, fee);
```

**Log monitoring script**:
```bash
#!/bin/bash
# monitor_logs.sh

# Recipient address must be supplied by the caller
ALERT_EMAIL="${ALERT_EMAIL:?set ALERT_EMAIL to the alert recipient}"

tail -f /var/log/myapp/app.log | while IFS= read -r line; do
    if echo "$line" | grep -q "ERROR"; then
        echo "Alert: $line" | mail -s "Application Error" "$ALERT_EMAIL"
    fi
done
```

## Getting Help

### Documentation Resources

1. **Official Documentation**:
   - **[[Cellframe SDK Reference|Main Documentation]]** - Complete documentation hub
   - **[[ETC/Architecture Overview|Architecture Guide]]** - System architecture overview
   - **[[ETC/Development Guide|Development Best Practices]]** - Development methodology

2. **Module References**:
   - **[[Modules/Module DAP Core|Core Module]]** - System initialization and utilities
   - **[[Modules/Module DAP Crypto|Cryptographic Functions]]** - Cryptographic operations
   - **[[Modules/Module Common|Blockchain Components]]** - Blockchain data structures
   - **[[ETC/Glossary|Function Index]]** - Complete alphabetical function reference

### Community Support

1.
   **Issue Reporting**:
   ```markdown
   # Bug Report Template

   ## Environment
   - OS: Ubuntu 20.04 LTS
   - Cellframe SDK Version: 3.4-0
   - Compiler: GCC 9.4.0

   ## Steps to Reproduce
   1. Initialize application with dap_common_init()
   2. Create encryption key with dap_enc_key_new()
   3. Application crashes

   ## Expected Behavior
   Key should be created successfully

   ## Actual Behavior
   Segmentation fault in dap_enc_key_new()

   ## Logs
   [ERROR] dap_enc_key_new(): Failed to allocate memory

   ## Additional Context
   - Application works on Ubuntu 18.04
   - Issue started after upgrading to SDK 3.4-0
   ```

2. **Performance Reports**:
   ```markdown
   # Performance Issue Template

   ## Environment
   - Hardware: 8 CPU cores, 16GB RAM
   - Load: Processing 1000 transactions/second
   - Issue: High CPU usage (>90%)

   ## Profiling Data
   - Hot function: process_transaction() (60% CPU time)
   - Memory usage: 2GB (stable)
   - Network I/O: Normal

   ## Configuration
   [network]
   worker_threads=8
   buffer_size=65536
   ```

### Diagnostic Information

**System information script**:
```bash
#!/bin/bash
# collect_diagnostic_info.sh

echo "=== System Information ==="
uname -a
cat /etc/os-release

echo "=== Hardware Information ==="
nproc
free -h
df -h

echo "=== Cellframe SDK Information ==="
pkg-config --modversion cellframe-sdk 2>/dev/null || echo "SDK not found via pkg-config"
find /usr -name "libdap_core.so*" 2>/dev/null | head -5

echo "=== Process Information ==="
ps aux | grep -E "(cellframe|dap)" | grep -v grep

echo "=== Network Information ==="
netstat -tlnp | grep -E "(8089|8080)"

echo "=== Log Files ==="
find /var/log -name "*cellframe*" -o -name "*dap*" 2>/dev/null | head -10
```

### Emergency Procedures

**Service recovery**:
```bash
#!/bin/bash
# emergency_recovery.sh

# Stop application
killall -TERM my_app
sleep 5
killall -KILL my_app 2>/dev/null

# Check for corrupted data
if [ -f /var/lib/myapp/data.db ]; then
    sqlite3 /var/lib/myapp/data.db "PRAGMA integrity_check;"
fi

# Restart with debug logging
export DAP_LOG_LEVEL=DEBUG
./my_app --safe-mode 2>&1 | tee /tmp/recovery.log
```

**Data backup**:
```bash
#!/bin/bash
# backup_data.sh

BACKUP_DIR="/backup/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Backup configuration
cp -r /etc/myapp "$BACKUP_DIR/config"

# Backup data
cp -r /var/lib/myapp "$BACKUP_DIR/data"

# Backup logs
cp -r /var/log/myapp "$BACKUP_DIR/logs"

echo "Backup created: $BACKUP_DIR"
```

---

*See also: **[[ETC/Development Guide|Development Guide]]**, **[[ETC/Services Overview|Services Overview]]**, **[[ETC/First Application|First Application]]***
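
As a supplement to the environment-variable override under Configuration Problems: that snippet relies on a `str_to_upper` helper which this guide does not define. The sketch below shows one possible way to build the `CELLFRAME_<SECTION>_<KEY>` name in a caller-supplied buffer. `build_env_name` is a hypothetical name used only for illustration, not a Cellframe SDK function, and the naming scheme is assumed from the override snippet rather than from documented SDK behavior.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the Cellframe SDK): writes
 * "CELLFRAME_<SECTION>_<KEY>" in upper case into out.
 * Returns 0 on success, -1 if the name does not fit in out. */
static int build_env_name(char *out, size_t out_size,
                          const char *section, const char *key)
{
    int n = snprintf(out, out_size, "CELLFRAME_%s_%s", section, key);
    if (n < 0 || (size_t)n >= out_size) {
        return -1; /* encoding error or truncated output */
    }
    for (char *p = out; *p; p++) {
        /* Cast to unsigned char: toupper() on a negative char is undefined */
        *p = (char)toupper((unsigned char)*p);
    }
    return 0;
}
```

Called with section `network` and key `port`, the helper fills the buffer with `CELLFRAME_NETWORK_PORT`, which can then be passed to `getenv`. Writing into a caller-owned buffer sidesteps the question of who frees a string returned by an upper-casing helper.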