Lecture

Simplified Recurrent Neural Network Structure, GRU

GRU (Gated Recurrent Unit) is a recurrent architecture created to address the limitations of plain RNNs, chiefly their difficulty learning long-term dependencies. It offers functionality similar to LSTM but with a simpler structure.

GRU alleviates the long-term dependency issue by remembering important information while discarding unnecessary data.


Why was GRU developed?

LSTM retains long-term information well, but its more complex structure means more parameters and more computation per time step, which can slow training.

GRU was designed to reach performance comparable to LSTM while using a simpler structure that trains faster.

Like LSTM, GRU uses gates, but it has two gates where LSTM has three, and it keeps no separate cell state, so each step needs fewer calculations.

This makes it easier to implement and often quicker to train.
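To make "fewer gates, simpler calculations" concrete, the sketch below counts trainable parameters for one GRU layer and one LSTM layer. It assumes a textbook formulation in which each gate or candidate has one input weight matrix, one recurrent weight matrix, and one bias vector. The function names and the sizes 128 and 256 are illustrative only, and real libraries may count biases differently.

```python
def recurrent_layer_params(input_size: int, hidden_size: int, num_weight_sets: int) -> int:
    """Count parameters for a gated recurrent layer.

    Assumption: every weight set holds an input matrix (hidden x input),
    a recurrent matrix (hidden x hidden) and a single bias vector (hidden).
    """
    per_set = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return num_weight_sets * per_set


def gru_params(input_size: int, hidden_size: int) -> int:
    # Three weight sets: update gate, reset gate, candidate state.
    return recurrent_layer_params(input_size, hidden_size, 3)


def lstm_params(input_size: int, hidden_size: int) -> int:
    # Four weight sets: input gate, forget gate, output gate, candidate cell.
    return recurrent_layer_params(input_size, hidden_size, 4)


# Hypothetical layer sizes chosen only for illustration.
print(gru_params(128, 256))   # 295680
print(lstm_params(128, 256))  # 394240
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size, which is one reason it tends to train faster.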


Key Structure of GRU

GRU is composed of the following two gates:

  • Update Gate: Decides how much of the previous state to carry forward and how much to replace with newly computed information.

  • Reset Gate: Determines how much of the previous state to ignore when new information is computed. It controls how strongly the previous state is combined with the current input.

These two gates work together to maintain important information and discard unnecessary information. Consequently, GRU can effectively process information in a sequence over time.
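In symbols, one common way to write the two gates is shown below. The notation is an assumption introduced here for illustration, not taken from this lesson: x_t is the current input, h_{t-1} is the previous state, W and U are weight matrices, b is a bias vector, and sigma is the sigmoid function.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)}
\end{aligned}
```

Because the sigmoid outputs values between 0 and 1, each gate acts as a soft switch: values near 1 let information through and values near 0 block it.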


How does GRU operate?

GRU functions through the following process:

  1. It calculates both the update gate and the reset gate from the current input and the previous state.

  2. The reset gate determines how much of the previous state is used when computing a candidate for the new state.

  3. The update gate decides how much of the previous state to keep and how much of the candidate state to take in.

  4. Finally, it blends the two into the new state and passes it to the next time step.

In this way, GRU processes information sequentially like an RNN, while effectively remembering older information with fewer calculations than an LSTM.
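The four steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the parameter names (W_z, U_z, b_z and so on), the random initialization, and the helper names are all hypothetical. It uses the convention in which the update gate weights the previous state. Some formulations swap the roles of z and 1 - z.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_gru_params(input_size, hidden_size, seed=0):
    """Create small random weights for the update gate (z), reset gate (r)
    and candidate state (h). Names and scale are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("z", "r", "h"):
        params["W_" + name] = rng.normal(0.0, 0.1, (hidden_size, input_size))
        params["U_" + name] = rng.normal(0.0, 0.1, (hidden_size, hidden_size))
        params["b_" + name] = np.zeros(hidden_size)
    return params


def gru_step(x_t, h_prev, p):
    """One GRU time step for a single input vector."""
    # Step 1: compute both gates from the current input and previous state.
    z = sigmoid(p["W_z"] @ x_t + p["U_z"] @ h_prev + p["b_z"])
    r = sigmoid(p["W_r"] @ x_t + p["U_r"] @ h_prev + p["b_r"])
    # Step 2: the reset gate scales the previous state inside the candidate.
    h_cand = np.tanh(p["W_h"] @ x_t + p["U_h"] @ (r * h_prev) + p["b_h"])
    # Steps 3 and 4: the update gate blends the old state with the candidate.
    return z * h_prev + (1.0 - z) * h_cand


def run_gru(inputs, hidden_size, params):
    """Feed a sequence through the cell, starting from a zero state."""
    h = np.zeros(hidden_size)
    for x_t in inputs:
        h = gru_step(x_t, h, params)
    return h


# Usage: a toy sequence of 5 time steps with 3 features each.
params = init_gru_params(input_size=3, hidden_size=4)
sequence = np.ones((5, 3))
final_state = run_gru(sequence, 4, params)
print(final_state.shape)
```

Note that the new state is a weighted average of the previous state and the candidate. When z is close to 1 the cell mostly copies its previous state forward, which is how older information can survive across many time steps.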


In the next lesson, we will explore the Transformer structure, which is often compared with recurrent neural network-based models.

Mission

What is the main advantage of GRU over LSTM?

It uses more gates, so it is more accurate

Its structure is simpler, so it trains faster

It can perform more complex operations

It can process more data
