Out of the Box


tag:google

Backlinks

This is a list of pages that link to the current page.

  • 2020-12_monte_carlo_transformer_stochastic_self_attention_model_sequence_prediction
  • attention_all_need
  • bert_pre_training_deep_bidirectional_transformers_language_understanding
  • deep_reinforcement_learning_amidst_lifelong_non_stationarity
  • hyperbolic_discounting_learning_over_multiple_horizons
  • neuroevolution_self_interpretable_agents
  • online_hyper_parameter_tuning_off_policy_learning_via_evolutionary_strategies
  • review:2015-11_policy_distillation
  • review:2018-03_world_models
  • review:2018-11_qt-opt_scalable_deep_reinforcement_learning_for_vision-based_robotic_manipulation
  • review:2019-04_evolving_rewards_to_automate_reinforcement_learning
  • review:2020-06_rigging_the_lottery_making_all_tickets_winners
  • review:2020-10_massively_large_scale_distributed_reinforcement_learning_menger
  • review:2020-11_training_efficientnets_at_supercomputer_scale_83_imagenet_top_1_accuracy_one_hour
  • review:2021-01_brax_differentiable_physics_engine_large_scale_rigid_body_simulation
  • review:2021-02_paired_emergent_complexity_and_zero-shot_transfer_via_unsupervised_environment_design
  • review:2021-03_pay_attention_to_mlps
  • review:2021-04_efficientnetv2_smaller_models_and_faster_training
  • review:2021-06_extracting_training_data_from_large_language_models
  • review:2021-07_reasoning-modulated_representations
  • review:2021-07_scalable_evaluation_of_multi-agent_reinforcement_learning_with_melting_pot
  • review:2022-05_simplex_neural_population_learning_any-mixture_bayes-optimality_in_symmetric_zero-sum_games
  • review:2023-03_scaling_instructable_agents_across_many_simulated_worlds
  • review:2023-05_deep_reinforcement_learning_with_plasticity_injection
  • review:2023-10_a_general_theoretical_paradigm_to_understand_learning_from_human_preferences
  • review:2023-12_diloco_distributed_low-communication_training_of_language_models
  • review:2024-01_asynchronous_local-sgd_training_for_language_modeling
  • review:2024-01_a_minimaximalist_approach_to_reinforcement_learning_from_human_feedback
  • review:2024-01_parrot_pareto-optimal_multi-reward_reinforcement_learning_framework_for_text-to-image_generation
  • review:2024-01_towards_conversational_diagnostic_ai
  • review:2024-01_warm_on_the_benefits_of_weight_averaged_reward_models
  • review:2024-02_genie_generative_interactive_environments
  • review:2024-03_dipaco_distributed_path_composition
  • review:2024-03_gemma_open_models_based_on_gemini_research_and_technology
  • review:2024-03_stop_regressing_training_value_functions_via_classification_for_scalable_deep_rl
  • review:2025-01_streaming_diloco_with_overlapping_communication_towards_a_distributed_free_lunch
  • review:big_bird_transformers_longer_sequences
  • review:paired_a_new_multi-agent_approach_for_adversarial_environment_generation
  • revisiting_rainbow_promoting_more_insightful_inclusive_deep_reinforcement_learning_research
  • revisiting_small_batch_training_deep_neural_networks
  • strategies_structuring_story_generation
  • value_function_polytope_reinforcement_learning

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International