# Testing Guide
This document provides comprehensive information about testing the ByteBiota system, including unit tests, integration tests, and User Acceptance Tests (UAT) for the distributed architecture.
## Test Suite Overview
The ByteBiota test suite is organized into several categories:
### 1. Unit Tests
- Purpose: Test individual components in isolation
- Coverage: Core simulation components, opcodes, memory management
- Duration: Fast (< 1 minute)
- Dependencies: Minimal (mostly mocked)
#### Distributed smoke run {#distributed-smoke}

- File: `tests/test_distributed_integration.py`
- Goal: Spin up an in-process server plus a lightweight worker and verify assignment flow, environment sync, and checkpoint round trips.
- Runtime: ~70 seconds.
- When to run: Pre-commit for changes touching worker execution, state aggregation, or checkpoint coordination.
- Key assertions:
  - Workers register, receive assignments, and return execution payloads without timing out.
  - `StateAggregator` merges births/deaths and maintains monotonic global step counters.
  - Checkpoints created by `CheckpointService` can be reopened and match live simulation stats.
#### Worker regression bundle {#distributed-worker-regression}

- Files: `tests/test_distributed_worker.py`, `tests/test_distributed_server.py`
- Goal: Exercise `LocalExecutor` time-slice mechanics, batching heuristics, heartbeat handling, and error recovery without launching subprocesses.
- Runtime: ~90 seconds combined.
- When to run: Any change to `LocalExecutor`, `WorkerManager`, mutation handling, or OS call wiring.
- Key assertions:
  - Mutation counters, energy debits, and reproduction bookkeeping remain consistent across assignments.
  - Worker reconnection paths resynchronize executor snapshots after transient failures.
  - Server APIs report coherent worker stats, seed bank deltas, and tuning payload status.
### 2. Integration Tests
- Purpose: Test component interactions and workflows
- Coverage: Server-worker communication, work assignment, state synchronization
- Duration: Medium (1-3 minutes)
- Dependencies: Some real components, mocked external dependencies
### 3. End-to-End Tests
- Purpose: Test complete system functionality
- Coverage: Full server-worker lifecycle, API endpoints, real communication
- Duration: Longer (3-10 minutes)
- Dependencies: Full system components, network communication
### 4. User Acceptance Tests (UAT)
- Purpose: Validate entire distributed system from user perspective
- Coverage: Server-worker coordination, simulation evolution, failure recovery, API correctness
- Duration: Long (5-15 minutes)
- Dependencies: Full system, multiple processes, temporary test data
## User Acceptance Testing (UAT)

### Overview
The UAT suite validates the distributed ByteBiota system from a user perspective, ensuring that:
- Server and workers start correctly and coordinate properly
- Simulation evolves realistically over time
- All API endpoints return correct and consistent data
- The system recovers gracefully from failures
- Workers can join and leave dynamically
- Checkpoint data is valid and recovery works
### UAT Test Structure

```
tests/
├── test_uat_distributed.py            # Main UAT test suite
├── uat_helpers.py                     # Helper functions and fixtures
├── fixtures/
│   └── test_checkpoint_template.json  # Template for test checkpoints
└── run_uat_tests.py                   # Dedicated UAT test runner
```
### UAT Test Categories
#### A. Server-Worker Startup and Coordination
- TestServerWorkerStartup: Basic startup, worker registration, multiple workers, heartbeat mechanism
- Validates that server starts cleanly and workers register successfully
- Ensures heartbeat mechanism works and server detects alive workers
#### B. Simulation Evolution Tests
- TestSimulationEvolution: Simulation starts and evolves, distributed evolution consistency
- Runs simulation for 2-5 minutes to verify organisms replicate and evolve
- Compares distributed results with expected evolution patterns
#### C. API Data Correctness

- TestAPIDataCorrectness: All key API endpoints return correct and consistent data
- Tests `/api/simulation/stats`, `/api/workers/stats`, `/api/distributed/overview`, etc.
- Cross-validates data consistency across multiple endpoints
#### D. Monitoring Data Validation
- TestMonitoringDataValidation: Monitoring and analytics endpoint validation
- Tests chart data format, worker details, trend analysis
- Ensures data format matches what web UI expects
#### E. Server Restart Recovery
- TestServerRestartRecovery: Graceful and crash recovery scenarios
- Tests server restart with worker reconnection
- Validates simulation state recovery from checkpoints
#### F. Worker Dynamics
- TestWorkerDynamics: Workers joining, leaving, and crash handling
- Tests dynamic worker management
- Ensures simulation continues with worker changes
#### G. Checkpoint Validation
- TestCheckpointValidation: Checkpoint creation, integrity, and recovery
- Tests checkpoint data structure and recovery from checkpoints
- Validates checkpoint data matches live simulation state
#### H. Performance and Load
- TestPerformanceAndLoad: System performance under load
- Tests concurrent API requests and resource limits
- Validates system handles multiple workers efficiently
### Running UAT Tests

#### Quick Start

```bash
# Run all UAT tests
python run_uat_tests.py

# Run specific test categories
python run_uat_tests.py --category startup evolution

# Run with custom configuration
python run_uat_tests.py --duration 300 --workers 5

# Run with detailed reporting
python run_uat_tests.py --verbose --coverage --report

# Run quick tests only (skip long evolution tests)
python run_uat_tests.py --quick
```
#### Using pytest directly

```bash
# Run all UAT tests
pytest tests/test_uat_distributed.py

# Run specific test class
pytest tests/test_uat_distributed.py::TestServerWorkerStartup

# Run specific test method
pytest tests/test_uat_distributed.py::TestServerWorkerStartup::test_server_startup

# Run with verbose output
pytest tests/test_uat_distributed.py -v
```
### UAT Configuration

#### Test Configuration Options
- Test Duration: Default 120 seconds for evolution tests
- Number of Workers: Default 3 workers for multi-worker tests
- Server Port: Default 9090 (avoids conflicts with production)
- Resource Limits: Conservative limits for test workers
- Checkpoint Interval: Fast checkpointing (30 seconds) for tests
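For illustration, these defaults could live in a single configuration object; a minimal sketch (the `UATConfig` name and its field names are hypothetical, not the actual helper in `uat_helpers.py`):

```python
from dataclasses import dataclass

@dataclass
class UATConfig:
    """Hypothetical container mirroring the UAT defaults listed above."""
    test_duration: int = 120       # seconds, for evolution tests
    num_workers: int = 3           # workers for multi-worker tests
    server_port: int = 9090        # avoids conflicts with production
    checkpoint_interval: int = 30  # fast checkpointing for tests

config = UATConfig(num_workers=5)  # override a default per test run
print(config.server_port, config.num_workers)  # → 9090 5
```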
#### Environment Variables

UAT tests can be configured via environment variables:

```bash
export UAT_TEST_DURATION=300   # Test duration in seconds
export UAT_NUM_WORKERS=5       # Number of workers
export UAT_START_PORT=9090     # Starting port for servers
export UAT_DEBUG=true          # Enable debug mode
export UAT_KEEP_TEMP=true      # Keep temp directories
```
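A runner can read these variables with fallbacks to the defaults above; a sketch (the `uat_setting` helper is illustrative, not the actual implementation):

```python
import os

def uat_setting(name, default, cast=int):
    """Read a UAT_* environment variable, falling back to a default.

    Hypothetical helper; booleans are parsed from 'true'/'1'/'yes'.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    if cast is bool:
        return raw.strip().lower() in ("1", "true", "yes")
    return cast(raw)

duration = uat_setting("UAT_TEST_DURATION", 120)
debug = uat_setting("UAT_DEBUG", False, cast=bool)
```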
### UAT Test Data

#### Test Checkpoints
UAT tests use temporary checkpoint directories that are:
- Created fresh for each test run
- Populated with test data during simulation
- Cleaned up automatically after tests
- Never impact production data
#### Test Data Validation
UAT tests validate:
- Data Structure: JSON schema validation for API responses
- Data Consistency: Cross-validation across multiple endpoints
- Value Ranges: Metrics within expected ranges
- Evolution Progress: Organisms replicate and evolve over time
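Structural checks like these can be expressed as a small validator; a minimal sketch (the payload field names are assumptions about the stats API, not its actual schema):

```python
def validate_stats_payload(payload):
    """Return a list of problems found in a stats response (empty = valid).

    The required fields below are illustrative, not the real schema.
    """
    problems = []
    for field, expected_type in (("population", int), ("step", int)):
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}")
    if payload.get("population", 0) < 0:
        problems.append("population out of range")
    return problems

print(validate_stats_payload({"population": 42, "step": 1000}))  # → []
```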
### UAT Success Criteria

#### Startup Tests
- Server responds to health checks within 5 seconds
- Workers register within 10 seconds
- All API endpoints return 200 OK
#### Evolution Tests
- Population > 1 after 2 minutes
- Step count increases monotonically
- Mutations occur (copy_bit_flips > 0)
- Organisms replicate (total_reaped increases)
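These criteria amount to comparing two stats snapshots taken a couple of minutes apart; a sketch (the `copy_bit_flips` and `total_reaped` keys follow the metric names above, everything else is an assumption):

```python
def check_evolution_progress(before, after):
    """Assert the evolution success criteria between two stats snapshots."""
    assert after["population"] > 1, "population did not grow past 1"
    assert after["step"] > before["step"], "step count did not increase"
    assert after["copy_bit_flips"] > 0, "no mutations detected"
    assert after["total_reaped"] > before["total_reaped"], "no organisms reaped"

before = {"step": 0, "total_reaped": 0}
after = {"population": 12, "step": 5000, "copy_bit_flips": 7, "total_reaped": 3}
check_evolution_progress(before, after)
print("evolution criteria met")  # → evolution criteria met
```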
#### Data Consistency
- Worker organism counts sum to global population
- Memory usage matches organism sizes
- Checkpoint data equals live API data (within timing tolerance)
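The first of these checks is a cross-endpoint sum; a sketch assuming each per-worker stats entry exposes an organism count (the field name is illustrative):

```python
def workers_consistent_with_global(worker_stats, global_population, tolerance=0):
    """Check that per-worker organism counts sum to the global population.

    `worker_stats` is a list of per-worker dicts; a small tolerance can
    absorb in-flight births/deaths between the two API calls.
    """
    total = sum(w["organism_count"] for w in worker_stats)
    return abs(total - global_population) <= tolerance

workers = [{"organism_count": 10}, {"organism_count": 7}, {"organism_count": 5}]
print(workers_consistent_with_global(workers, 22))  # → True
```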
#### Recovery Tests
- Workers reconnect within 30 seconds of server restart
- Zero organism loss after graceful shutdown
- Simulation continues with <5% population variance after recovery
#### Resilience Tests
- Server continues with n-1 workers after one fails
- Worker dynamics: System stable with workers joining/leaving
- No crashes or exceptions in logs
### UAT Troubleshooting

#### Common Issues
**Port Conflicts**

Error: `Address already in use`

Solution: UAT tests use ports 9090-9095. Check for running processes:

```bash
lsof -i :9090-9095
```

**Worker Registration Timeout**

Error: `Workers failed to register`

Solution: Increase the timeout or check server startup logs:

```bash
python run_uat_tests.py --debug --keep-temp
```

**Evolution Test Failures**

Error: `No mutations detected`

Solution: Increase the test duration or check simulation parameters:

```bash
python run_uat_tests.py --duration 300
```
#### Debug Mode

Run UAT tests with debug output:

```bash
python run_uat_tests.py --debug --verbose --keep-temp
```
This will:
- Show detailed process logs
- Keep temporary directories for inspection
- Provide verbose test output
- Enable additional logging
### UAT Reporting

#### Test Reports
UAT tests generate several types of reports:
- HTML Report: Detailed test results with timing and assertions
- Coverage Report: Code coverage for distributed components
- Summary Report: JSON summary with configuration and results
- Console Output: Real-time test progress and results
#### Report Locations

Reports are saved to the `uat_reports/` directory by default:

```
uat_reports/
├── uat_report_<timestamp>.html   # HTML test report
├── coverage_html/                # Coverage report
├── uat_summary_<timestamp>.json  # Summary report
└── temp_directories/             # Debug temp dirs (if --keep-temp)
```
### UAT Best Practices

#### Test Isolation
Each UAT test runs in complete isolation:
- Fresh temporary directories
- Clean server/worker processes
- Reset state between tests
- No shared resources
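The temporary-directory part of this isolation can be sketched with the standard library (the real suite presumably does this via fixtures in `uat_helpers.py`; the names here are illustrative):

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def isolated_checkpoint_dir():
    """Yield a fresh checkpoint directory, removed again on exit."""
    with tempfile.TemporaryDirectory(prefix="uat_") as tmp:
        checkpoint_dir = Path(tmp) / "checkpoints"
        checkpoint_dir.mkdir()
        yield checkpoint_dir
    # the whole temp tree is gone here, so nothing leaks into the next test

with isolated_checkpoint_dir() as d:
    (d / "state.json").write_text("{}")
    print(d.exists())  # → True
```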
#### Test Data Management
- Use temporary directories for all test data
- Never modify production checkpoints
- Clean up all temporary files
- Validate data structure before use
#### Performance Considerations
- Use conservative resource limits for test workers
- Run evolution tests for appropriate duration
- Monitor system resources during tests
- Use parallel execution for independent tests
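Concurrent request load can be generated with a thread pool; a sketch that stubs out the HTTP call (in the real tests `fetch_stats` would be a `requests.get` against the test server; the stub is an assumption):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_stats(endpoint):
    """Stand-in for an HTTP GET against the UAT server (illustrative)."""
    return {"endpoint": endpoint, "status": 200}

endpoints = ["/api/simulation/stats", "/api/workers/stats",
             "/api/distributed/overview"] * 10  # 30 requests in total

with ThreadPoolExecutor(max_workers=8) as pool:
    responses = list(pool.map(fetch_stats, endpoints))

assert all(r["status"] == 200 for r in responses)
print(f"{len(responses)} requests completed")  # → 30 requests completed
```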
#### Continuous Integration
UAT tests are designed for CI environments:
- Minimal external dependencies
- Configurable timeouts and resource limits
- Parallel execution support
- Comprehensive error reporting
## Integration with Existing Tests

### Test Hierarchy

```
ByteBiota Test Suite
├── Unit Tests (fast, isolated)
├── Integration Tests (medium, component interaction)
├── End-to-End Tests (long, full system)
└── User Acceptance Tests (comprehensive, user perspective)
```
### Running All Tests

```bash
# Run all test types
pytest tests/ -v

# Run specific test types
pytest tests/test_*.py -v               # Unit tests
pytest tests/test_distributed_*.py -v   # Integration tests
pytest tests/test_uat_distributed.py -v # UAT tests

# Run with coverage
pytest tests/ --cov=src.bytebiota --cov-report=html
```
## Test Dependencies

### Required Packages

```bash
pip install pytest pytest-asyncio pytest-mock requests
pip install pytest-xdist pytest-cov pytest-html  # Optional
```
### System Requirements
- Python 3.8+
- Sufficient system resources for multiple processes
- Network access for local server communication
- Temporary directory write permissions
## Contributing to Tests

### Adding New UAT Tests

- Choose the right category: Add to an existing test class or create a new one
- Use existing fixtures: Leverage `uat_helpers.py` fixtures
- Follow test patterns: Use the Arrange-Act-Assert structure
- Add appropriate markers: Use `@pytest.mark.slow` for long tests
- Update documentation: Add a test description to this file
### Test Quality Guidelines
- Isolation: Tests should not depend on each other
- Deterministic: Tests should produce consistent results
- Fast: Unit tests should complete quickly
- Clear: Test names and assertions should be self-documenting
- Comprehensive: Cover both success and failure scenarios
### Wiki Integration

All test code includes `@wiki` tags linking to this documentation:

```python
# @wiki: wiki/testing.md#uat-startup-tests
class TestServerWorkerStartup:
    """Test server-worker startup and coordination."""
```

This ensures documentation stays synchronized with code changes.
## Conclusion
The ByteBiota test suite provides comprehensive coverage from unit tests to user acceptance tests. The UAT suite specifically validates the distributed system from a user perspective, ensuring reliability, correctness, and resilience across all scenarios.
For questions or issues with the test suite, please refer to the troubleshooting sections or create an issue in the project repository.