# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.2.0] - 2026-01-05

### Added

- Complete data pipeline implementation
- Database connection and session management with SQLAlchemy
- ORM models for 5 tables (OHLCVData, DetectedPattern, PatternLabel, SetupLabel, Trade)
- Repository pattern implementation (OHLCVRepository, PatternRepository)
- Data loaders for CSV, Parquet, and Database sources with auto-detection
- Data preprocessors (missing data handling, duplicate removal, session filtering)
- Data validators (OHLCV validation, continuity checks, outlier detection)
- Pydantic schemas for type-safe data validation
- Utility scripts:
  - `setup_database.py` - Database initialization
  - `download_data.py` - Data download/conversion
  - `process_data.py` - Batch data processing with CLI
  - `validate_data_pipeline.py` - Comprehensive validation suite
- Integration tests for database operations
- Unit tests for all data pipeline components (21 tests total)

### Features

- Connection pooling for database (configurable pool size and overflow)
- SQLite and PostgreSQL support
- Timezone-aware session filtering (3-4 AM EST trading window)
- Batch insert optimization for database operations
- Parquet format support for 10x faster loading
- Comprehensive error handling with custom exceptions
- Detailed logging for all data operations

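As an illustration of the timezone-aware session filtering listed under Features, here is a minimal sketch. It assumes "EST" means a fixed UTC-5 offset (no daylight saving) and that the window is half-open, 03:00 inclusive to 04:00 exclusive. The names `in_session` and `filter_session` are hypothetical and are not taken from the project's code.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: "EST" is read literally as a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5), name="EST")
SESSION_START = time(3, 0)  # inclusive
SESSION_END = time(4, 0)    # exclusive (assumed half-open window)


def in_session(ts: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls in [03:00, 04:00) EST."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    local = ts.astimezone(EST).time()
    return SESSION_START <= local < SESSION_END


def filter_session(timestamps):
    """Keep only the timestamps that fall inside the session window."""
    return [ts for ts in timestamps if in_session(ts)]


# Hypothetical sample data, given in UTC.
sample = [
    datetime(2026, 1, 5, 7, 59, tzinfo=timezone.utc),  # 02:59 EST, outside
    datetime(2026, 1, 5, 8, 0, tzinfo=timezone.utc),   # 03:00 EST, inside
    datetime(2026, 1, 5, 8, 59, tzinfo=timezone.utc),  # 03:59 EST, inside
    datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc),   # 04:00 EST, outside
]
kept = filter_session(sample)
```

Converting to the session's timezone before comparing means the filter gives the same result whatever timezone the input timestamps carry. If the intended window follows US Eastern time with daylight saving, a zone such as `America/New_York` from `zoneinfo` would replace the fixed offset.
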
### Tests

- 21/21 tests passing (100% success rate)
- Test coverage: 59% overall, 84%+ for data module
- SQLAlchemy 2.0 compatibility ensured
- Proper test isolation with unique timestamps

### Validated

- Successfully processed real data: 45,801 rows → 2,575 session rows
- Database operations working with connection pooling
- All data loaders, preprocessors, and validators tested with real data
- Validation script: 7/7 checks passing

### Documentation

- `V0.2.0_DATA_PIPELINE_COMPLETE.md` - Comprehensive completion guide
- Updated all module docstrings with Google-style format
- Added usage examples in utility scripts

## [0.1.0] - 2026-01-XX

### Added

- Project foundation with complete directory structure
- Comprehensive logging system with JSON and console formatters
- Configuration management with YAML and environment variable support
- Custom exception hierarchy for error handling
- Core constants and enums for pattern types and trading concepts
- Base classes for detectors and models
- Initial test suite with pytest
- Development tooling (black, flake8, mypy, pre-commit hooks)
- Documentation structure

### Infrastructure

- Git repository initialization
- Requirements files for production and development
- `setup.py` and `pyproject.toml` for package management
- Makefile for common commands
- `.gitignore` with comprehensive patterns
- Environment variable template (`.env.example`)