Organisation Profile

PROGRAMMERS' PARADISE TECHNICAL SOCIETY

Programmers' Paradise has evolved into a full-fledged Technical Society. We are a vibrant and inclusive community dedicated to fostering innovation, technical excellence, and collaborative learning among students. Whether you're diving into programming for the first time or you're a seasoned developer, there's something here for everyone. Our mission is to empower students through technical skill development, peer collaboration, and hands-on experience. We aim to provide a supportive platform for learning, creating, and growing in diverse areas of technology through events, workshops, and projects.

TypeScript, HTML, JavaScript, Python

MENTORS

Aksh Agrawal

PROJECTS

Ak-dskit (DsKit) - Unified Data Science & ML Toolkit

Python, NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, Plotly, SHAP, XGBoost, LightGBM, CatBoost, Optuna, Hyperopt, PyPI

Problem Statement

Data scientists and ML engineers often spend 60-80% of their time on repetitive tasks like data cleaning, EDA, preprocessing, and basic modeling. There's a need for a unified, production-ready toolkit that wraps complex operations into simple 1-line commands while maintaining flexibility and comprehensive feature coverage. Ak-dskit solves this by providing 221+ wrapper functions that streamline the entire ML pipeline, from data loading to model deployment, making data science accessible, efficient, and production-ready.

What Students Will Work On:

• Beginner Level:
  1. Writing documentation and tutorials
  2. Creating example notebooks
  3. Adding unit tests for existing functions
  4. Fixing bugs and improving error handling
  5. Enhancing function docstrings
• Intermediate Level:
  1. Implementing new preprocessing methods
  2. Adding visualization functions
  3. Creating data validation utilities
  4. Building feature engineering tools
  5. Developing data profiling capabilities
• Advanced Level:
  1. Implementing advanced AutoML features
  2. Building custom ML algorithms
  3. Creating model deployment pipelines
  4. Developing distributed computing support
  5. Implementing neural network wrappers

Expected Outcomes:

1. At least 10-15 new utility functions added to the library
2. Comprehensive test coverage (target: >80%)
3. 5+ tutorial notebooks demonstrating real-world use cases
4. Improved documentation with API reference updates
5. Performance optimization for core functions
6. New features like automated report generation, data drift detection, or A/B testing utilities
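To make the "1-line command" idea concrete, here is a minimal sketch of how a wrapper might bundle several cleaning steps behind one call. The function name `quick_clean` and its behaviour are hypothetical and are not taken from Ak-dskit's actual API. The project lists Pandas among its technologies, so a real version would likely operate on DataFrames; plain lists of dicts are used here only to keep the sketch dependency-free.

```python
from statistics import median


def quick_clean(rows):
    """Hypothetical one-call cleaner (illustration only, not Ak-dskit's API).

    Normalises column names, drops exact duplicate rows, and fills
    missing numeric values (None) with the column median.
    """
    # Step 1: strip whitespace from column names and lower-case them.
    cleaned = [
        {key.strip().lower(): value for key, value in row.items()}
        for row in rows
    ]

    # Step 2: drop exact duplicate rows, keeping the first occurrence.
    seen = set()
    unique = []
    for row in cleaned:
        fingerprint = tuple(sorted(row.items(), key=lambda item: item[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)

    # Step 3: fill missing values in numeric columns with the median.
    columns = {key for row in unique for key in row}
    for column in columns:
        present = [
            row[column]
            for row in unique
            if isinstance(row.get(column), (int, float))
        ]
        if not present:
            continue  # non-numeric or entirely empty column: leave as-is
        fill = median(present)
        for row in unique:
            if row.get(column) is None:
                row[column] = fill
    return unique


# Example input with messy column names, a gap, and a duplicate row.
raw = [
    {" Age": 30, "City ": "Pune"},
    {" Age": None, "City ": "Delhi"},
    {" Age": 40, "City ": "Agra"},
    {" Age": 30, "City ": "Pune"},  # exact duplicate of the first row
]
tidy = quick_clean(raw)  # the caller needs only this single line
```

The design point is that the caller sees one function while each internal step stays small enough to unit-test on its own, which fits the beginner-level task of adding tests for existing functions.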

Focus Area

• Extending data cleaning and preprocessing capabilities
• Adding new feature engineering methods
• Improving automated EDA functions
• Enhancing model validation utilities

Visualization & Explainability
• Creating new interactive visualization functions
• Improving SHAP integration for model explainability
• Building custom plotting utilities for specific use cases
• Developing hyperplane visualization for advanced algorithms

AutoML & Optimization
• Expanding hyperparameter tuning capabilities
• Adding support for new ML algorithms
• Implementing ensemble methods
• Building automated feature selection tools

Documentation & Testing
• Writing comprehensive tutorials and guides
• Creating Jupyter notebook examples
• Developing unit tests for existing functions
• Improving API documentation

DevOps & Deployment
• Setting up CI/CD pipelines
• Creating Docker containers for deployment
• Building model serving utilities
• Implementing monitoring and logging features

Database & Time Series
• Enhancing database utility functions
• Expanding time series analysis capabilities
• Adding support for new data sources
• Building data versioning tools

Student Contribution Guide (Idea Page)

Annie

Rust, PyO3, Python, ML

Problem Statement

A lightning-fast, Rust-powered Approximate Nearest Neighbour library for Python with multiple backends, thread-safety, and GPU acceleration.

Core Features:

• Multiple Backends:
  ◦ Brute-force (exact) with SIMD acceleration
  ◦ HNSW (approximate) for large-scale datasets
• Multiple Distance Metrics: Euclidean, Cosine, Manhattan, Chebyshev
• Batch Queries for efficient processing
• Thread-safe indexes with concurrent access
• Zero-copy NumPy integration
• On-disk Persistence with serialization
• Filtered Search with custom Python callbacks
• GPU Acceleration for brute-force calculations
• Multi-platform support (Linux, Windows, macOS)
• Automated CI with performance tracking
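For contributors new to nearest-neighbour search, the sketch below shows what an exact brute-force backend computes, using the four distance metrics listed above and an optional filter callback. It is a plain-Python illustration under stated assumptions: the function name `brute_force_search`, its parameters, and the `keep` callback are hypothetical and do not reflect Annie's actual API, which is implemented in Rust.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    # Cosine distance = 1 - cosine similarity; assumes non-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


METRICS = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "chebyshev": chebyshev,
    "cosine": cosine,
}


def brute_force_search(vectors, query, k, metric="euclidean", keep=None):
    """Return the k closest (distance, index) pairs, nearest first.

    Hypothetical helper for illustration. `keep` is an optional callback
    that receives a vector's index and returns True if that vector may
    appear in the results (a simple form of filtered search).
    """
    distance = METRICS[metric]
    candidates = (
        (distance(vector, query), index)
        for index, vector in enumerate(vectors)
        if keep is None or keep(index)
    )
    # Exact search: every candidate is scored, then the k smallest kept.
    return heapq.nsmallest(k, candidates)


vectors = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (2.0, 0.0)]
query = (1.0, 0.0)

nearest = brute_force_search(vectors, query, k=2)
filtered = brute_force_search(vectors, query, k=1, keep=lambda i: i != 0)
by_angle = brute_force_search(vectors, query, k=3, metric="cosine")
```

Brute force scores every stored vector, so query cost grows linearly with the dataset. That is the trade-off an approximate index such as HNSW addresses, at the price of occasionally missing a true nearest neighbour.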

Focus Area

Students need to work on the issues present in the repository.

Ready to collaborate?

Join the community chat, review the issue tracker, and pick a project to start contributing. Mentors are available to help you scope your first patch.
