Blog
Stories about data science, artificial intelligence, and deep learning
77 posts
![[Paper Review] Aligning machine and human visual representations across abstraction levels](/assets/images/blog/20260103-paper-url-pdf-aligning-machine-and-human-vis.jpg)
[Paper Review] Aligning machine and human visual representations across abstraction levels
...
![[Paper Review] Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration](/assets/images/blog/20260103-paper-2512-22382-completed-hyperparameter-trans.jpg)
[Paper Review] Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration
Hyperparameter tuning can dramatically impact training stability and final performance of large-scale models. Recent works on neural network parameterisations, such as μP, have enabled transfer of o...
![[Paper Review] Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks](/assets/images/blog/20260103-paper-2512-22255-shape-of-thought-when-distribu.jpg)
[Paper Review] Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks
We present the surprising finding that a language model's reasoning capabilities can be improved by training on synthetic datasets of chain-of-thought (CoT) traces from more capable models, even when ...
![[Paper Review] SemanticGen: Video Generation in Semantic Space](/assets/images/blog/20260103-paper-2512-20619-semanticgen-video-generation-i.jpg)
[Paper Review] SemanticGen: Video Generation in Semantic Space
State-of-the-art video generative models typically learn the distribution of video latents in the VAE space and map them to pixels using a VAE decoder. While this approach can generate high-quality vi...
![[Paper Review] The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding](/assets/images/blog/20260103-paper-2512-19693-the-prism-hypothesis-harmonizi.jpg)
[Paper Review] The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding
Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and pixel encoders. Interestingly, our stud...
![[Paper Review] Epistemological Fault Lines Between Human and Artificial Intelligence](/assets/images/blog/20260103-paper-2512-19466-epistemological-fault-lines-be.jpg)
[Paper Review] Epistemological Fault Lines Between Human and Artificial Intelligence
Large language models (LLMs) are widely described as artificial intelligence, yet their epistemic profile diverges sharply from human cognition. Here we show that the apparent alignment between human ...
![[Paper Review] Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement](/assets/images/blog/20260103-paper-2512-18950-learning-hierarchical-procedur.jpg)
[Paper Review] Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement
We present MACLA, a framework that decouples reasoning from learning by maintaining a frozen large language model while performing all adaptation in an external hierarchical procedural memory. MACLA e...
![[Paper Review] Sophia: A Persistent Agent Framework of Artificial Life](/assets/images/blog/20260103-paper-2512-18202-sophia-a-persistent-agent-fram.jpg)
[Paper Review] Sophia: A Persistent Agent Framework of Artificial Life
The development of LLMs has elevated AI agents from task-specific tools to long-lived, decision-making entities. Yet, most architectures remain static and reactive, tethered to manually defined, narro...
![[Paper Review] Distributional AGI Safety](/assets/images/blog/20260103-paper-2512-16856-distributional-agi-safety.jpg)
[Paper Review] Distributional AGI Safety
AI safety and alignment research has predominantly been focused on methods for safeguarding individual AI systems, resting on the assumption of an eventual emergence of a monolithic Artificial General...
![[Paper Review] LLaDA2.0: Scaling Up Diffusion Language Models to 100B](/assets/images/blog/20260103-paper-2512-15745-llada2-0-scaling-up-diffusion-.jpg)
[Paper Review] LLaDA2.0: Scaling Up Diffusion Language Models to 100B
This paper presents LLaDA2.0 -- a tuple of discrete diffusion large language models (dLLM) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establi...
![[Paper Review] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding](/assets/images/blog/20260103-paper-2512-13586-refusion-a-diffusion-large-lan.jpg)
[Paper Review] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Autoregressive models (ARMs) are hindered by slow sequential inference. While masked diffusion models (MDMs) offer a parallel alternative, they suffer from critical drawbacks: high computational overh...
![[Paper Review] Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model](/assets/images/blog/20260103-paper-2512-13507-seedance-1-5-pro-a-native-audi.jpg)
[Paper Review] Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
Recent strides in video generation have paved the way for unified audio-visual generation. In this work, we present Seedance 1.5 pro, a foundational model engineered specifically for native, joint aud...
![[Paper Review] An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges](/assets/images/blog/20260103-paper-2512-11362-an-anatomy-of-vision-language-.jpg)
[Paper Review] An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
Vision-Language-Action (VLA) models are driving a revolution in robotics, enabling machines to understand instructions and interact with the physical world. This field is exploding with new models and...
![[Paper Review] Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer](/assets/images/blog/20260103-paper-2511-22699-z-image-an-efficient-image-gen.jpg)
[Paper Review] Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
The landscape of high-performance image generation models is currently dominated by proprietary systems, such as Nano Banana Pro and Seedream 4.0. Leading open-source alternatives, including Qwen-Imag...
![[Paper Review] ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration](/assets/images/blog/20260103-paper-2511-21689-toolorchestra-elevating-intell.jpg)
[Paper Review] ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
Large language models are powerful generalists, yet solving deep and complex problems such as those of Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensiv...
![[Paper Review] Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework](/assets/images/blog/20260103-paper-2511-21686-matrix-peer-to-peer-multi-agen.jpg)
[Paper Review] Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy-sensitive. Many such generation tasks require coordinate...
![[Paper Review] Flow Map Distillation Without Data](/assets/images/blog/20260103-paper-2511-19428-flow-map-distillation-without-.jpg)
[Paper Review] Flow Map Distillation Without Data
State-of-the-art flow models achieve remarkable quality but require slow, iterative sampling. To accelerate this, flow maps can be distilled from pre-trained teachers, a procedure that conventionally ...
![[Paper Review] Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens](/assets/images/blog/20260103-paper-2511-19418-chain-of-visual-thought-teachi.jpg)
[Paper Review] Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
Vision-Language Models (VLMs) excel at reasoning in linguistic space but struggle with perceptual understanding that requires dense visual perception, e.g., spatial reasoning and geometric awareness. ...
![[Paper Review] CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning](/assets/images/blog/20260103-paper-2511-18659-clara-bridging-retrieval-and-g.jpg)
[Paper Review] CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge but still suffers from long contexts and disjoint retrieval-generation optimization. In this work, we...
![[Paper Review] OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists](/assets/images/blog/20260103-paper-2511-16931-omniscientist-toward-a-co-evol.jpg)
[Paper Review] OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
With the rapid development of Large Language Models (LLMs), AI agents have demonstrated increasing proficiency in scientific tasks, ranging from hypothesis generation and experimental design to manusc...
![[Paper Review] Evolution Strategies at the Hyperscale](/assets/images/blog/20260103-paper-2511-16652-evolution-strategies-at-the-hy.jpg)
[Paper Review] Evolution Strategies at the Hyperscale
We introduce Evolution Guided General Optimization via Low-rank Learning (EGGROLL), an evolution strategies (ES) algorithm designed to scale backprop-free optimization to large population sizes for mo...
![[Paper Review] A Primer on Quantum Machine Learning](/assets/images/blog/20260103-paper-2511-15969-a-primer-on-quantum-machine-le.jpg)
[Paper Review] A Primer on Quantum Machine Learning
Quantum machine learning (QML) is a computational paradigm that seeks to apply quantum-mechanical resources to solve learning problems. As such, the goal of this framework is to leverage quantum proce...
![[Paper Review] Fine-Tuned LLMs Know They Don't Know: A Parameter-Efficient Approach to Recovering Honesty](/assets/images/blog/20260103-paper-2511-12991-fine-tuned-llms-know-they-don-.jpg)
[Paper Review] Fine-Tuned LLMs Know They Don't Know: A Parameter-Efficient Approach to Recovering Honesty
The honesty of Large Language Models (LLMs) is increasingly important for safe deployment in high-stakes domains. However, this crucial trait is severely undermined by supervised fine-tuning (SFT), a ...
![[Paper Review] Black-Box On-Policy Distillation of Large Language Models](/assets/images/blog/20260103-paper-2511-10643-black-box-on-policy-distillati.jpg)
[Paper Review] Black-Box On-Policy Distillation of Large Language Models
Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work...
![[Paper Review] DoPE: Denoising Rotary Position Embedding](/assets/images/blog/20260103-paper-2511-09146-dope-denoising-rotary-position.jpg)
[Paper Review] DoPE: Denoising Rotary Position Embedding
Rotary Position Embedding (RoPE) in Transformer models has inherent limits that weaken length extrapolation. We reinterpret the attention map with positional encoding as a noisy feature map, and propo...
![[Paper Review] Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models](/assets/images/blog/20260103-paper-2511-08577-think-at-hard-selective-latent.jpg)
[Paper Review] Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
Improving reasoning capabilities of Large Language Models (LLMs), especially under parameter constraints, is crucial for real-world applications. Prior work proposes recurrent transformers, which allo...
![[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B](/assets/images/blog/20260103-paper-2511-06221-tiny-model-big-logic-diversity.jpg)
[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
Challenging the prevailing consensus that small models inherently lack robust reasoning, this report introduces VibeThinker-1.5B, a 1.5B-parameter dense model developed via our Spectrum-to-Signal Prin...
![[Paper Review] World Simulation with Video Foundation Models for Physical AI](/assets/images/blog/20260103-paper-2511-00062-world-simulation-with-video-fo.jpg)
[Paper Review] World Simulation with Video Foundation Models for Physical AI
We introduce Cosmos-Predict2.5, the latest generation of the Cosmos World Foundation Models for Physical AI. Built on a flow-based architecture, Cosmos-Predict2.5 unifies Text2World, Image2World, ...
![[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models](/assets/images/blog/20260103-paper-2510-26658-the-era-of-agentic-organizatio.jpg)
[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models
We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize ...
![[Paper Review] Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision](/assets/images/blog/20260103-paper-2509-14234-compute-as-teacher-turning-inf.jpg)
[Paper Review] Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
Where do learning signals come from when there is no ground truth in post-training? We propose turning exploration into supervision through Compute as Teacher (CaT), which converts the model's own exp...
![[Paper Review] Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings](/assets/images/blog/20260103-paper-2509-10534-decoupling-the-what-and-where-.jpg)
[Paper Review] Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
The attention mechanism in a Transformer architecture matches key to query based on both content -- the what -- and position in a sequence -- the where. We present an analysis indicating that what and...
![[Paper Review] Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning](/assets/images/blog/20260103-paper-2509-03646-emergent-hierarchical-reasonin.jpg)
[Paper Review] Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
Reinforcement Learning (RL) has proven highly effective at enhancing the complex reasoning abilities of Large Language Models (LLMs), yet the underlying mechanisms driving this success remain largely opaq...
![[Paper Review] From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery](/assets/images/blog/20260103-paper-2508-14111-from-ai-for-science-to-agentic.jpg)
[Paper Review] From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
Artificial intelligence (AI) is reshaping scientific discovery, evolving from specialized computational tools into autonomous research partners. We position Agentic Science as a pivotal stage within t...
![[Paper Review] Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2](/assets/images/blog/20260103-paper-2502-03544-gold-medalist-performance-in-s.jpg)
[Paper Review] Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
We present AlphaGeometry2 (AG2), a significantly improved version of AlphaGeometry introduced in (Trinh et al., 2024), which has now surpassed an average gold medalist in solving Olympiad geometry pro...
![[Paper Review] Nested Learning: The Illusion of Deep Learning Architecture](/assets/images/blog/20260102-paper-url-pdf-nested-learning-the-illusion-o.jpg)
[Paper Review] Nested Learning: The Illusion of Deep Learning Architecture
Over the last decades, developing more powerful neural architectures and simultaneously designing optimization algorithms to effectively train them have been the core of research efforts to enhance th...
![[Paper Review] Act2Goal: From World Model To General Goal-conditioned Policy](/assets/images/blog/20260102-paper-2512-23541-act2goal-from-world-model-to-g.jpg)
[Paper Review] Act2Goal: From World Model To General Goal-conditioned Policy
Specifying robotic manipulation tasks in a manner that is both expressive and precise remains a central challenge. While visual goals provide a compact and unambiguous task specification, existing goa...
![[Paper Review] Pruning as a Game: Equilibrium-Driven Sparsification of Neural Networks](/assets/images/blog/20260102-paper-2512-22106-pruning-as-a-game-equilibrium-.jpg)
[Paper Review] Pruning as a Game: Equilibrium-Driven Sparsification of Neural Networks
Neural network pruning is widely used to reduce model size and computational cost. Yet, most existing methods treat sparsity as an externally imposed constraint, enforced through heuristic importance ...
![[Paper Review] AgentEvolver: Towards Efficient Self-Evolving Agent System](/assets/images/blog/20260102-paper-2511-10395-agentevolver-towards-efficient.jpg)
[Paper Review] AgentEvolver: Towards Efficient Self-Evolving Agent System
Autonomous agents powered by large language models (LLMs) have the potential to significantly enhance human productivity by reasoning, using tools, and executing complex tasks in diverse environments....
![[Paper Review] TiDAR: Think in Diffusion, Talk in Autoregression](/assets/images/blog/20260102-paper-2511-08923-tidar-think-in-diffusion-talk-.jpg)
[Paper Review] TiDAR: Think in Diffusion, Talk in Autoregression
Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language model...
![[Paper Review] AlphaResearch: Accelerating New Algorithm Discovery with Language Models](/assets/images/blog/20260102-paper-2511-08522-alpharesearch-accelerating-new.jpg)
[Paper Review] AlphaResearch: Accelerating New Algorithm Discovery with Language Models
Large language models have made significant progress in complex but easy-to-verify problems, yet they still struggle with discovering the unknown. In this paper, we present AlphaResearch, an ...
![[Paper Review] Attention and Compression is all you need for Controllably Efficient Language Models](/assets/images/blog/20260102-paper-2511-05313-attention-and-compression-is-a.jpg)
[Paper Review] Attention and Compression is all you need for Controllably Efficient Language Models
The quadratic cost of attention in transformers motivated the development of efficient approaches: namely sparse and sliding window attention, convolutions and linear attention. Although these approac...
![[Paper Review] The Curved Spacetime of Transformer Architectures](/assets/images/blog/20260102-paper-2511-03060-the-curved-spacetime-of-transf.jpg)
[Paper Review] The Curved Spacetime of Transformer Architectures
We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. Queries and keys induce an effective metric on representation s...
![[Paper Review] No-Human in the Loop: Agentic Evaluation at Scale for Recommendation](/assets/images/blog/20260102-paper-2511-03051-no-human-in-the-loop-agentic-e.jpg)
[Paper Review] No-Human in the Loop: Agentic Evaluation at Scale for Recommendation
Evaluating large language models (LLMs) as judges is increasingly critical for building scalable and trustworthy evaluation pipelines. We present ScalingEval, a large-scale benchmarking study that sys...
![[Paper Review] ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning](/assets/images/blog/20260102-paper-2510-27492-thinkmorph-emergent-properties.jpg)
[Paper Review] ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
Multimodal reasoning requires iterative coordination between language and vision, yet it remains unclear what constitutes a meaningful interleaved chain of thought. We posit that text and image though...
![[Paper Review] Context Engineering 2.0: The Context of Context Engineering](/assets/images/blog/20260102-paper-2510-26493-context-engineering-2-0-the-co.jpg)
[Paper Review] Context Engineering 2.0: The Context of Context Engineering
Karl Marx once wrote that "the human essence is the ensemble of social relations", suggesting that individuals are not isolated entities but are fundamentally shaped by their interactions with other...
![[Paper Review] Chain-of-Thought Hijacking](/assets/images/blog/20260102-paper-2510-26418-chain-of-thought-hijacking.jpg)
[Paper Review] Chain-of-Thought Hijacking
Large reasoning models (LRMs) achieve higher task performance with more inference-time computation, and prior works suggest this scaled reasoning may also strengthen safety by improving refusal. Yet w...
![[Paper Review] GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning](/assets/images/blog/20260102-paper-2510-25320-gap-graph-based-agent-planning.jpg)
[Paper Review] GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning
Autonomous agents powered by large language models (LLMs) have shown impressive capabilities in tool manipulation for complex task-solving. However, existing paradigms such as ReAct rely on sequential...
![[Paper Review] Reasoning with Sampling: Your Base Model is Smarter Than You Think](/assets/images/blog/20260102-paper-2510-14901-reasoning-with-sampling-your-b.jpg)
[Paper Review] Reasoning with Sampling: Your Base Model is Smarter Than You Think
Frontier reasoning models have exhibited incredible capabilities across a wide array of disciplines, driven by posttraining large language models (LLMs) with reinforcement learning (RL). However, desp...
![[Paper Review] Cache-to-Cache: Direct Semantic Communication Between Large Language Models](/assets/images/blog/20260102-paper-2510-03215-cache-to-cache-direct-semantic.jpg)
[Paper Review] Cache-to-Cache: Direct Semantic Communication Between Large Language Models
Multi-LLM systems harness the complementary strengths of diverse Large Language Models, achieving performance and efficiency gains unattainable by a single model. In existing designs, LLMs communicate...
![[Paper Review] Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks](/assets/images/blog/20260102-paper-2503-09572-plan-and-act-improving-plannin.jpg)
[Paper Review] Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks
Large language models (LLMs) have shown remarkable advancements in enabling language agents to tackle simple tasks. However, applying them for complex, multi-step, long-horizon tasks remains a challen...

The Importance of Testing and How to Implement It
There is probably no developer who has gone through software development without hearing the word "testing." Testing plays that essential a role in guaranteeing the reliability and stability of software. It matters even more in data science and artificial intelligence, where the correctness of results and the performance of models must be verified. In this post, we look at why testing matters and how to implement tests efficiently...
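To make the idea concrete, here is a minimal sketch of what a pytest test for a small data-science helper might look like; the `normalize` function and all values are illustrative, not taken from the post:

```python
import numpy as np
import pytest

def normalize(x: np.ndarray) -> np.ndarray:
    """Scale an array to zero mean and unit variance."""
    std = x.std()
    if std == 0:
        raise ValueError("constant input cannot be normalized")
    return (x - x.mean()) / std

def test_normalize_has_zero_mean_and_unit_variance():
    z = normalize(np.array([1.0, 2.0, 3.0, 4.0]))
    assert z.mean() == pytest.approx(0.0)
    assert z.std() == pytest.approx(1.0)

def test_normalize_rejects_constant_input():
    # Edge cases deserve their own test: constant input has no variance
    with pytest.raises(ValueError):
        normalize(np.zeros(4))
```

Running `pytest` in the project directory discovers and executes both tests automatically.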
![[Paper Review] mHC: Manifold-Constrained Hyper-Connections](/assets/images/blog/20260101-paper-2512-24880-mhc-manifold-constrained-hyper.jpg)
[Paper Review] mHC: Manifold-Constrained Hyper-Connections
Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifyi...
![[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models](/assets/images/blog/20260101-paper-2510-26658-the-era-of-agentic-organizatio.jpg)
[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models
We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize ...
![[Paper Review] Real Deep Research for AI, Robotics and Beyond](/assets/images/blog/20260101-paper-2510-20809-real-deep-research-for-ai-robo.jpg)
[Paper Review] Real Deep Research for AI, Robotics and Beyond
With the rapid growth of research in AI and robotics, now producing over 10,000 papers annually, it has become increasingly difficult for researchers to stay up to date. Fast evolving trends, the rise o...
![[Paper Review] The Free Transformer](/assets/images/blog/20260101-paper-2510-17558-the-free-transformer.jpg)
[Paper Review] The Free Transformer
We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experiment...
![[Paper Review] Training-Free Group Relative Policy Optimization](/assets/images/blog/20260101-paper-2510-08191-training-free-group-relative-p.jpg)
[Paper Review] Training-Free Group Relative Policy Optimization
Recent advances in Large Language Model (LLM) agents have demonstrated their promising general capabilities. However, their performance in specialized real-world domains often degrades due to challeng...
![[Paper Review] ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory](/assets/images/blog/20260101-paper-2509-25140-reasoningbank-scaling-agent-se.jpg)
[Paper Review] ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
With the growing adoption of large language model agents in persistent real-world roles, they naturally encounter continuous streams of tasks. A key limitation, however, is their failure to learn from...
![[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality](/assets/images/blog/20260101-paper-2405-21060-transformers-are-ssms-generali.jpg)
[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
While Transformers have been the main architecture behind deep learning's success in language modeling, state-space models (SSMs) such as Mamba have recently been shown to match or outperform Transfor...
![[Paper Review] Mamba: Linear-Time Sequence Modeling with Selective State Spaces](/assets/images/blog/20260101-paper-2312-00752-mamba-linear-time-sequence-mod.jpg)
[Paper Review] Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time a...
![[Paper Review] Vector database management systems: Fundamental concepts, use-cases, and current challenges](/assets/images/blog/20260101-paper-2309-11322-vector-database-management-sys.jpg)
[Paper Review] Vector database management systems: Fundamental concepts, use-cases, and current challenges
Vector database management systems have emerged as an important component in modern data management, driven by the growing need to computationally describe rich data such as texts, ...

Improving Model Performance with Data Augmentation
In artificial intelligence (AI) and machine learning (ML), data is the most important asset. Securing a sufficient amount of high-quality data is a key factor in determining model performance. In practice, however, data is often scarce, or collecting it takes significant time and money. To address this problem, data augmentation techniques have been gaining attention...
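For a taste of what augmentation looks like in code, here is a small sketch using torchvision transforms; the file name and parameter values are placeholders chosen for illustration:

```python
from torchvision import transforms
from PIL import Image

# A typical augmentation pipeline: every pass over the dataset
# sees a slightly different version of the same image.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

img = Image.open("sample.jpg")   # hypothetical input image
x = augment(img)                 # tensor of shape (C, H, W), randomly transformed
```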

Building a RAG System: Principles and Implementation of Retrieval-Augmented Generation
As artificial intelligence (AI) advances, natural language processing (NLP) is seeing innovation on many fronts. One of these is the RAG (Retrieval-Augmented Generation) system. RAG combines two processes, retrieval and generation, and focuses on providing more accurate and richer information. While traditional NLP models rely on large-scale data...
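For a rough feel of the retrieve-then-generate flow, here is a toy sketch that uses TF-IDF retrieval in place of a real embedding model and vector database; the documents, query, and prompt format are all illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A toy document store; a production system would use neural
# embeddings and a vector database instead of TF-IDF.
docs = [
    "RAG combines retrieval with generation.",
    "Transformers use self-attention over token sequences.",
    "Vector databases store dense embeddings for similarity search.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(docs)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = vectorizer.transform([query])
    scores = cosine_similarity(q, doc_matrix)[0]
    top = scores.argsort()[::-1][:k]
    return [docs[i] for i in top]

query = "How does retrieval-augmented generation work?"
context = "\n".join(retrieve(query))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
# `prompt` would then go to an LLM for the generation step.
```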

PyTorch Tensor Basics
Deep learning stands at the center of modern artificial intelligence (AI). Among its frameworks, PyTorch is especially popular with researchers and developers, widely used for its flexibility and intuitive interface. To understand PyTorch properly, it is important to have a firm grasp of its basic unit, the tensor. Tensors represent data...
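As a quick preview, a few lines of PyTorch show the tensor operations such a post builds on; the shapes and values below are arbitrary:

```python
import torch

# Creating tensors
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # from a Python list
b = torch.zeros(2, 2)                         # filled with zeros
c = torch.randn(2, 2)                         # standard normal samples

# Shape, dtype, and device are the first three attributes to check
print(a.shape, a.dtype, a.device)             # torch.Size([2, 2]) torch.float32 cpu

# Elementwise ops and matrix multiplication
d = a + c
e = a @ c                                     # matrix product

# Tensors track gradients when requires_grad=True
x = torch.ones(2, requires_grad=True)
y = (x * x).sum()
y.backward()                                  # d/dx of sum(x^2) is 2x
print(x.grad)                                 # tensor([2., 2.])
```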
![[Paper Review] QLoRA: Efficient Finetuning of Quantized LLMs](/assets/images/blog/20251231-paper-2305-14314-qlora-efficient-finetuning-of-.jpg)
[Paper Review] QLoRA: Efficient Finetuning of Quantized LLMs
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLo...
![[Paper Review] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](/assets/images/blog/20251231-paper-1810-04805-bert-pre-training-of-deep-bidi.jpg)
[Paper Review] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed t...

Introduction to MLOps: Building a Machine Learning Operations Pipeline
Machine learning (ML) is driving innovation across many industries today. But successfully developing a machine learning model is not enough on its own. Deploying a model to production, monitoring it, and continuously improving it take dedicated effort. This is where MLOps plays a crucial role...

LLM Fine-Tuning: LoRA and QLoRA Techniques
Large language models (LLMs) have driven innovation in natural language processing (NLP). Trained on vast amounts of data, these models can perform a wide range of linguistic tasks. However, LLMs demand enormous computing resources, and the fine-tuning process of adapting a model to a specific task...
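To illustrate the low-rank idea behind LoRA, here is a minimal sketch of a linear layer with a frozen base weight and a trainable low-rank update; the dimensions, rank, and scaling are illustrative, not a production implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (a sketch of the LoRA idea)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: the update starts at zero
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(4, 768))              # only A and B receive gradients
```

With rank 8, the trainable update holds 2 × 8 × 768 parameters instead of the 768 × 768 of a full weight matrix, which is the source of the memory savings.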

Diffusion Models: Principles and Applications of Image-Generating AI
Artificial intelligence (AI) has advanced rapidly in recent years, with image generation drawing particular attention. At the center of this progress sits a powerful technique called the diffusion model. Diffusion models excel at learning complex patterns and generating realistic images, and they are finding applications across many industries...
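As a glimpse of the mechanics, here is a sketch of the closed-form forward (noising) step used in DDPM-style diffusion models, assuming a simple linear noise schedule; all values are illustrative:

```python
import torch

# Closed-form forward (noising) step of a DDPM-style diffusion process:
# x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def noise_sample(x0: torch.Tensor, t: int):
    """Jump directly from a clean image x0 to its noisy version at step t."""
    eps = torch.randn_like(x0)
    xt = alphas_bar[t].sqrt() * x0 + (1 - alphas_bar[t]).sqrt() * eps
    return xt, eps                              # the network is trained to predict eps from (xt, t)

x0 = torch.randn(1, 3, 32, 32)                  # stand-in for a normalized image
xt, eps = noise_sample(x0, t=500)
```

Generation then runs this process in reverse, denoising step by step from pure noise.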

Deep Learning Basics
Deep learning is one of the most prominent technologies in artificial intelligence (AI), driving transformative change across many fields. Its role in solving complex problems such as image recognition and natural language processing grows by the day. In this article, we cover the basic concepts of deep learning and look at how to implement them in practice...
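For a concrete starting point, here is a minimal sketch of a fully connected network and one training step in PyTorch; the layer sizes and data are placeholders:

```python
import torch
import torch.nn as nn

# A minimal fully connected network and one gradient-descent step
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 784)                 # a batch of 32 flattened 28x28 images
y = torch.randint(0, 10, (32,))          # integer class labels

logits = model(x)                        # forward pass
loss = loss_fn(logits, y)                # compute the loss
optimizer.zero_grad()
loss.backward()                          # backpropagation
optimizer.step()                         # parameter update
```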

Word Embedding in Natural Language Processing
Word embedding is one of the essential concepts in natural language processing (NLP). It is a key technique that helps computers understand and process human language. In this blog post, we go over the basic ideas and major techniques of word embedding and try them out hands-on with Python code...
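As a tiny preview of the hands-on part, here is how cosine similarity compares embedding vectors; the 4-dimensional vectors below are made up for illustration, whereas real models such as Word2Vec or GloVe use hundreds of dimensions:

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity: 1.0 for same direction, 0.0 for orthogonal vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 4-dimensional embeddings, invented for illustration
emb = {
    "king":  np.array([0.8, 0.6, 0.1, 0.0]),
    "queen": np.array([0.7, 0.7, 0.1, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
}

print(cosine(emb["king"], emb["queen"]))  # high: related words point the same way
print(cosine(emb["king"], emb["apple"]))  # low: unrelated words diverge
```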

Understanding the Transformer Attention Mechanism
Natural language processing (NLP) has gone through a revolution in recent years, and at its center stands the Transformer model. Transformers deliver outstanding performance across NLP tasks and are used in language modeling, translation, summarization, question answering, and many other applications. At the core of the Transformer...
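At the heart of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal sketch of the computation looks like this; the shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V"""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # similarity of each query to each key
    weights = F.softmax(scores, dim=-1)             # each row sums to 1
    return weights @ v                              # weighted average of the values

# One batch, 5 tokens, 64-dimensional heads
q = torch.randn(1, 5, 64)
k = torch.randn(1, 5, 64)
v = torch.randn(1, 5, 64)
out = scaled_dot_product_attention(q, k, v)         # shape (1, 5, 64)
```

The division by sqrt(d_k) keeps the dot products from growing with dimension, which would otherwise push the softmax into saturated, near one-hot regions.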

The Complete Guide to Reinforcement Learning: From Theory to Practice
Reinforcement learning is a key technology in artificial intelligence (AI) that gives machines the ability to learn and make decisions on their own. It has shown groundbreaking results in fields such as robot control, game playing, and autonomous driving. This guide covers the history of reinforcement learning, its theoretical background, major algorithms, and hands-on implementation in depth.
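As a taste of the implementation side, here is a sketch of the tabular Q-learning update, one of the classic algorithms such a guide covers; the state space and transition shown are hypothetical:

```python
import numpy as np

# Tabular Q-learning: Q(s, a) <- Q(s, a) + alpha * [r + gamma * max_a' Q(s', a') - Q(s, a)]
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

def choose_action(s: int) -> int:
    """Epsilon-greedy: mostly exploit the best known action, sometimes explore."""
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(Q[s].argmax())

def update(s: int, a: int, r: float, s_next: int) -> None:
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

a = choose_action(s=0)                 # pick an action in state 0
update(s=0, a=a, r=1.0, s_next=2)      # hypothetical reward and next state
```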
![[Paper Review] Attention Is All You Need](/assets/images/blog/20251230-paper-1706-03762-attention-is-all-you-need.jpg)
[Paper Review] Attention Is All You Need
The landmark paper that first proposed the Transformer architecture. By discarding RNNs and CNNs entirely and presenting a new paradigm for sequence modeling built on the attention mechanism alone, it laid the foundation for modern natural language processing, including BERT and GPT.

Graph Neural Networks Basics
Artificial intelligence (AI) and machine learning (ML) have recently been driving innovation across many fields. Among these advances, graph neural networks (GNNs) stand out as a powerful tool for efficiently processing complex structured data. Graph data arises naturally in many real-world problems such as social networks, recommender systems, and molecular structures...
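To make the core idea concrete, here is a sketch of one GCN-style message-passing step, where each node aggregates its neighbors' features; the graph is a toy example and the random weight matrix stands in for a learned one:

```python
import torch

# One round of message passing with mean aggregation:
# each node's new feature depends on its neighbors' features.
A = torch.tensor([[0, 1, 1, 0],   # adjacency matrix of a 4-node graph
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=torch.float32)
X = torch.randn(4, 8)             # an 8-dimensional feature per node

A_hat = A + torch.eye(4)                       # add self-loops
D_inv = torch.diag(1.0 / A_hat.sum(dim=1))     # inverse degree matrix
W = torch.randn(8, 16)                         # stand-in for a learned weight matrix
H = torch.relu(D_inv @ A_hat @ X @ W)          # aggregate neighbors, project, activate
print(H.shape)                                 # torch.Size([4, 16])
```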

CNN Image Classification: A Core Deep Learning Technique
Image classification is one of the most important problems in computer vision, with applications ranging from obstacle recognition in autonomous vehicles to medical image analysis and image tagging on social media. Convolutional neural networks (CNNs) in particular deliver outstanding performance on image data and stand at the center of the deep learning revolution. In this blog post...
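For a concrete picture, here is a minimal sketch of a small CNN classifier in PyTorch for 32x32 RGB inputs; the channel counts and number of classes are illustrative:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Two conv blocks followed by a linear classifier head, for 32x32 RGB inputs."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 32x32 -> 32x32
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
logits = model(torch.randn(4, 3, 32, 32))  # output shape (4, 10)
```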

The Complete Guide to BERT: A Revolutionary Model for NLP
BERT (Bidirectional Encoder Representations from Transformers) is the groundbreaking natural language processing model Google released in 2018. We take a detailed look at the architecture and training method of BERT, the model whose bidirectional context understanding brought a new paradigm to NLP, along with how to use it in practice.
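As a quick taste of using BERT in practice, here is a sketch that loads a pretrained checkpoint with the Hugging Face transformers library and encodes one sentence; it assumes the transformers package is installed and the example sentence is arbitrary:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load a pretrained BERT and encode a sentence
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT reads context in both directions.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token; the [CLS] position is often used as a sentence summary
print(outputs.last_hidden_state.shape)   # (1, sequence_length, 768)
```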

Welcome to the SuanLab Blog
The SuanLab blog shares a wide range of stories about data science, artificial intelligence, and deep learning.
