Most of the new deep learning models being released, especially in NLP, are very large: they have anywhere from hundreds of millions to tens of billions of parameters. Given a good enough architecture, the larger the model, the more learning capacity it has. These new models therefore have enormous learning capacity and are trained on very large datasets, so they learn the underlying distribution of the data they are trained on. One can say that they encode compressed knowledge of these datasets. This makes them useful for very interesting applications, the most common of which is transfer learning. Transfer learning is fine-tuning pre-trained models on a new, typically much smaller, task-specific dataset so that the knowledge acquired during pre-training carries over to the new task.
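To make the idea concrete, here is a minimal sketch of transfer learning, assuming the Hugging Face transformers library and PyTorch: a BERT encoder pre-trained on large text corpora is loaded and fine-tuned on a tiny, made-up sentiment classification batch. The model name, toy data, and hyperparameters are illustrative assumptions, not details from the article.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load weights that already encode knowledge from large-scale pre-training.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tiny, hypothetical task-specific dataset (two sentiment-labeled sentences).
texts = ["I loved this movie.", "This was a waste of time."]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Fine-tune: update the pre-trained weights with a small learning rate.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few passes over the toy batch
    optimizer.zero_grad()
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()

In a real setting the toy batch would be replaced by a proper training set and validation loop, but the pattern is the same: reuse the pre-trained weights and nudge them with a small learning rate on the new task instead of training from scratch.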
This story continues at The Next Web
from The Next Web https://ift.tt/2zniZjP