%% Begin Waypoint %%
- **_Misc**
- **_Cool Products**
- [[Difftastic]]
- [[abstraction]]
- [[Asynchronous]]
- [[dirty bit]]
- [[domain]]
- [[Hot Key (or Celebrity) Problem]]
- [[latency]]
- [[little bit of snarkiness in discussing technology architecture]]
- [[LlamaParse]]
- [[marketecture]]
- [[niche]]
- [[obtuse language for marketecture]]
- [[Overloaded Technical Terms]]
- [[persisted]]
- [[Shaw's Principle]]
- [[Stateful]]
- [[stateless]]
- [[Synchronous]]
- [[web scraping tools]]
- **[[Algorithms]]**
- **Application Development**
- **[[00_SDLC]]**
- **Coding**
- **Coding with AI**
- [[custom .cursorrules templates]]
- [[Declarative]]
- [[Formatting Best Practices]]
- [[Imperative]]
- **Key Concepts**
- [[Garbage Collection]]
- **[[Languages]]**
- [[Containerization]]
- **Organizational**
- [[Analytics Engineering]]
- [[Architect]]
- [[Development Approaches]]
- [[Development Team Roles]]
- [[Product Management]]
- **Products**
- [[Docker]]
- [[ELK Stack]]
- [[Github]]
- [[Jenkins]]
- [[Kubernetes]]
- [[Prometheus]]
- [[Web Development]]
- **[[Data Structures]]**
- **Leetcode**
- [[00_Leetcode Approach]]
- [[Big O - Space Complexity]]
- [[Big O - Time Complexity]]
- [[General Tips]]
- **Patterns**
- [[Pattern1 - Sliding Window]]
- [[Pattern2 - Two Pointers]]
- [[Pattern3 - Fast and Slow Pointers]]
- [[Pattern4 - Merge Intervals]]
- [[Pattern5 - Cyclic Sort]]
- [[Pattern6 - In-place Reversal of a LinkedList]]
- [[Pattern7 - Tree-breadth First Search (BFS)]]
- [[Pattern8 - Tree Depth First Search (DFS)]]
- [[Pattern9 - Two Heaps]]
- [[Pattern10 - Subsets]]
- [[Pattern11 - Modify Binary Search]]
- [[Pattern12 - Bitwise XOR]]
- **Machine Learning and AI**
- **00_Data Science Process**
- [[01_Data Science Process]]
- **1.0 Understand Business and Data**
- **2.1 Exploratory Data Analysis (EDA)**
- [[2.1 EDA]]
- **[[2.2 Data Preprocessing]]**
- **[[2.3 Feature Engineering]]**
- **[[2.4 Feature Selection]]**
- **2.5 Train Model**
- [[attention mechanism]]
- [[Embedding Models]]
- [[ensemble model]]
- [[fine-tuning a model]]
- [[Gradient Accumulation]]
- **[[Hyperparameter Tuning]]**
- **NLP**
- [[Natural Language Processing (NLP)]]
- [[Text Data Feature Engineering]]
- [[Retrieval Augmented Generation (RAG)]]
- **[[Supervised Models]]**
- [[transfer learning]]
- [[Unsupervised Model]]
- **[[Unsupervised Models]]**
- **2.6 Model Evaluation**
- [[GLUE Benchmark]]
- [[ML Experimentation and Evaluation]]
- **2.7 Productionization**
- [[LLM Orchestration Frameworks]]
- [[Model Hosting (Inference)]]
- [[00_MachineLearning and AI Overview]]
- [[Deep Learning]]
- **[[GenAI]]**
- [[Machine Learning (ML) and AI]]
- **ML Products**
- [[ChatGPT]]
- [[Gradio]]
- [[Hugging Face]]
- [[Jupyter Notebooks]]
- [[Ollama]]
- [[Retrieval Augmented Generation (RAG) with Knowledge Graphs]]
- **Statistics**
- [[bootstrapping]]
- [[Cross Entropy Loss]]
- [[error rate]]
- [[Evaluation Metrics]]
- [[logits]]
- [[loss function]]
- [[Partial Dependence]]
- [[PCA]]
- [[Quantization]]
- [[SMOTE]]
- [[Softmax]]
- [[Z-score]]
- **Testing**
- [[KPIs]]
- [[vector search]]
- **System Design Concepts**
- **Coordination**
- **Coordination Products**
- [[Consul]]
- [[etcd]]
- [[cooridination service]]
- **Cybersecurity**
- [[Encryption]]
- [[Firewall]]
- [[SSL and TLS]]
- **Data Movement**
- [[00_Data Movement]]
- **API**
- **API Gateway Products**
- [[Apigee]]
- [[AWS API Gateway]]
- [[Azure API Management]]
- [[Google Cloud Endpoints]]
- [[Kong]]
- [[Tyk]]
- [[API Gateway]]
- [[APIs]]
- **HTTP Commands**
- [[GET]]
- [[PUT]]
- [[Pagination]]
- [[Request-Response]]
- [[REST]]
- [[Batch Processing]]
- [[Change Data Capture (CDC)]]
- **[[Communication Protocols]]**
- [[Data Lineage]]
- [[Data Movement Architectures]]
- [[Data Pipelines]]
- [[Data Replication for Analytical Architectures]]
- [[Data Replication]]
- **[[Data Serialization]]**
- [[ETL]]
- [[Extract, Load, Transform (ELT)]]
- [[Fivetran]]
- **Messaging**
- [[Complex Event Processing (CEP)]]
- [[Event Aggregator]]
- [[Event Bus]]
- [[Event Sourcing]]
- [[Event Storming]]
- [[Java Message Service (JMS)]]
- [[Log-based Message Broker]]
- [[Message Broker]]
- [[Message Queue]]
- [[Traditional Style Message Broker]]
- [[Object Relational Mappings (ORMs)]]
- **Open File Formats**
- [[Apache Parquet]]
- [[Columnar Open File Formats]]
- **Products**
- [[Apache Flink]]
- [[Apache ZooKeeper]]
- **Messaging**
- [[ActiveMQ]]
- [[IBM WebSphere]]
- [[Kafka]]
- [[RabbitMQ]]
- [[Short Polling]]
- **Streaming**
- [[Data Stream]]
- [[Stream Analytics]]
- [[Stream Data Joins]]
- [[Stream Processing]]
- **Streaming Products**
- [[Apache Spark Streaming]]
- [[Materialize]]
- [[Time Windows]]
- [[Webhook]]
- [[Workflow Managers]]
- **[[Data Platforms]]**
- **Data Visualization**
- [[Business Intelligence (BI)]]
- **Products**
- [[Grafana]]
- [[Tableau]]
- **[[Data Warehouses]]**
- **Databases**
- [[ACID]]
- [[Append-Only Log]]
- [[BASE]]
- [[Batch Data Joins]]
- [[Blob Storage]]
- [[Causality]]
- **[[Conflict Resolution]]**
- [[counter batching]]
- **[[Data Modeling]]**
- [[Database Serializability]]
- **Database Storage Engines**
- [[Compaction]]
- [[Global Secondary Index]]
- [[heap file]]
- **Indexing**
- [[B-Tree Index]]
- [[B+ Tree Index]]
- [[Clustered Index]]
- [[Composite Index]]
- [[Covering Index]]
- [[Database Index]]
- [[Hash Index]]
- [[Log Structured Merge (LSM) Trees Index]]
- [[Partial Index]]
- [[R-trees Index]]
- [[Leveled Compaction]]
- [[Local Secondary Index]]
- [[Log-structured Storage Engines]]
- [[Page-Oriented Storage Engines]]
- [[Size-tiered Compaction]]
- **Storage Engine Products**
- [[InnoDB]]
- [[LevelDB]]
- [[RocksDB]]
- [[Database Transactions]]
- [[heterogeneous data]]
- [[homogenous data]]
- **[[Isolation]]**
- [[Last Write Wins (LWW)]]
- [[MemTable]]
- [[NoSQL Database]]
- **NoSQL Databases**
- **[[Types of NoSQL Databases]]**
- [[Object Storage]]
- [[Partitioning]]
- **Products**
- [[Apache Lucene]]
- **[[Blob Storage Products]]**
- **Metrics DB Products**
- [[DataJunction]]
- **MPP Databases**
- [[Impala]]
- **[[NoSQL DB Products]]**
- **Plug-ins**
- [[PG Vector]]
- [[pgvector]]
- **[[Relational DB Products]]**
- [[Spanner]]
- [[Supabase]]
- **[[Query Engines]]**
- **Query Languages**
- [[Apache Spark]]
- **Graph**
- [[Cypher]]
- [[Datalog]]
- [[Gremlin]]
- [[SPARQL]]
- [[SQL]]
- [[Relational Databases]]
- **Replication**
- [[Active-active Replication]]
- [[Active-passive Replication]]
- [[Asynchronous Data Replication]]
- [[Database Replication]]
- [[Leaderless Replication]]
- [[Logical (row-based) log replication]]
- [[Multi-leader Replication]]
- [[Single Leader-Follower Replication Model]]
- [[Single-leader Replication]]
- [[Statement-based Replication]]
- [[Synchronous Data Replication]]
- [[Write Ahead Log (WAL) Shipping Replication]]
- [[SQL joins]]
- [[SSTable]]
- [[Storage Types]]
- [[Stored Procedure]]
- [[Three-phase Commit (3PC)]]
- [[Two-phase Commit (2PC)]]
- **Frontend**
- [[Cache-aside]]
- [[Content Optimization]]
- [[Frontend Performance Improvements]]
- [[Write-around Cache]]
- [[Write-back Cache]]
- [[Write-through Cache]]
- **Fundamentals**
- [[Authentication]]
- [[Authorization]]
- **[[Caching]]**
- [[CAP Theorem]]
- **[[Compression]]**
- [[Content Delivery Network (CDN)]]
- [[Cookies]]
- [[Databases]]
- **[[Hashing]]**
- [[idempotency]]
- [[Load Balancer]]
- [[Open Source]]
- [[Open System Interconnection (OSI) Model]]
- [[Rate Limiting]]
- [[Websites]]
- [[Write-ahead Log (WAL)]]
- **Hardware and Networking**
- [[Active-Passive Load Balancer]]
- [[Anycast Load Balancing]]
- [[Binary]]
- [[Character Encodings]]
- **Cloud**
- **AWS**
- [[AWS Architecture]]
- **Reference Architectures**
- [[AWS Cloud Resume Reference Architecture]]
- [[Cloud Computing]]
- [[Computer Server]]
- [[Coordination]]
- [[Data Centers]]
- [[Domain Name System (DNS)]]
- [[Elastic Load Balancer]]
- **[[File Storage]]**
- [[Global Networking]]
- [[Global Server Load Balancing (GSLB)]]
- [[Graphics Processing Units (GPU)]]
- [[Hard Disk Drive (HDD)]]
- [[Horizontal Scaling]]
- [[Layer 3 - Network]]
- [[Layer 4 - Transport]]
- [[Layer 4 Load Balancer]]
- [[Layer 7 - Application]]
- [[Layer 7 Load Balancer]]
- [[Local Networking]]
- [[Memory]]
- **Operating Systems**
- **OS Products**
- [[Linux]]
- [[Operating Systems (OS)]]
- [[Random Access Memory (RAM)]]
- [[Server Software]]
- [[Tensor Processing Units (TPU)]]
- [[URL]]
- [[Vertical Scaling]]
- [[Virtual Machines]]
- **[[Locking]]**
- **System Design Patterns**
- [[Async Job Worker Pool]]
- [[Contending Updates]]
- **Data Architecture Patterns**
- [[Lambda Architecture]]
- [[Medallion (or Multi-Hop) Architecture]]
- [[Durable Job Processing]]
- [[Event-Driven Architecture (EDA)]]
- [[Fanout-on-read and Fanout-on-write]]
- [[Microservices Architecture]]
- [[Mobile Applications Architecture]]
- **Principles**
- [[Pub-sub]]
- [[Recommendation Systems]]
- [[Service-oriented Architecture (SOA)]]
- [[Simple DB-Backed CRUD Service with Caching]]
- [[Two Stage Architectures]]
- **z_Images**
%% End Waypoint %%