Credit Assignment

  • Reference work entry
  • First Online: 01 January 2017

  • Claude Sammut


Structural credit assignment ; Temporal credit assignment

When a learning system employs a complex decision process, it must assign credit or blame for the outcomes to each of its decisions. Where it is not possible to directly attribute an individual outcome to each decision, it is necessary to apportion credit and blame among the combinations of decisions that contributed to the outcome. We distinguish two cases in the credit assignment problem. Temporal credit assignment refers to the assignment of credit for outcomes to actions. Structural credit assignment refers to the assignment of credit for actions to internal decisions. The first subproblem involves determining when the actions that deserve credit were taken, and the second involves assigning credit to the internal structure of actions (Sutton 1984).

Consider the problem of learning to balance a pole that is hinged on a cart (Michie and Chambers 1968; Anderson and Miller 1991). The cart...
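One common way to make temporal credit assignment concrete is the discounted return used in reinforcement learning (Sutton and Barto 1998). The Python sketch below is illustrative only and is not taken from this entry: the function name `discounted_returns`, the reward values, and the discount factor `gamma` are assumptions. It spreads a single failure signal at the end of an episode (for example, the pole falling) back over the earlier time steps, so that actions closer to the failure receive more blame.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return, for each time step, the discounted sum of all later rewards.

    rewards: list of per-step rewards for one episode.
    gamma:   discount factor in [0, 1]; smaller values concentrate
             credit or blame on the most recent actions.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: no feedback until a failure signal of -1 at the end.
episode_rewards = [0.0, 0.0, 0.0, -1.0]
credit = discounted_returns(episode_rewards, gamma=0.5)
print(credit)
```

With `gamma=0.5`, the blame assigned to each step halves for every step further away from the failure.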


Recommended Reading

Albus JS (1975) A new approach to manipulator control: the cerebellar model articulation controller (CMAC). J Dyn Syst Measur Control Trans ASME 97(3):220–227


Anderson CW, Miller WT (1991) A set of challenging control problems. In: Miller W, Sutton RS, Werbos PJ (eds) Neural networks for control. MIT Press, Cambridge


Atkeson C, Schaal S, Moore A (1997) Locally weighted learning. AI Rev 11:11–73

Banerjee B, Liu Y, Youngblood GM (eds) (2006) Proceedings of the ICML workshop on “structural knowledge transfer for machine learning”, Pittsburgh

Barto A, Sutton R, Anderson C (1983) Neuron-like adaptive elements that can solve difficult learning control problems. IEEE Trans Syst Man Cybern SMC-13:834–846


Benson S, Nilsson NJ (1995) Reacting, planning and learning in an autonomous agent. In: Furukawa K, Michie D, Muggleton S (eds) Machine intelligence, vol 14. Oxford University Press, Oxford

Bertsekas DP, Tsitsiklis J (1996) Neuro-dynamic programming. Athena Scientific, Nashua


Caruana R (1997) Multitask learning. Mach Learn 28:41–75

Dejong G, Mooney R (1986) Explanation-based learning: an alternative view. Mach Learn 1:145–176

Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley Longman Publishing, Boston

Grefenstette JJ (1988) Credit assignment in rule discovery systems based on genetic algorithms. Mach Learn 3(2–3):225–245

Hinton G, Rumelhart D, Williams R (1985) Learning internal representation by back-propagating errors. In: Rumelhart D, McClelland J, PDP Research Group (eds) Parallel distributed processing: explorations in the microstructure of cognition, vol 1. MIT Press, Cambridge, pp 318–362

Holland J (1986) Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Michalski RS, Carbonell JG, Mitchell TM (eds) Machine learning: an artificial intelligence approach, vol 2. Morgan Kaufmann, Los Altos

Laird JE, Newell A, Rosenbloom PS (1987) SOAR: an architecture for general intelligence. Artif Intell 33(1):1–64

Mahadevan S (2009) Learning representation and control in Markov decision processes: new frontiers. Found Trends Mach Learn 1(4):403–565

Michie D, Chambers R (1968) Boxes: an experiment in adaptive control. In: Dale E, Michie D (eds) Machine intelligence, vol 2. Oliver and Boyd, Edinburgh

Minsky M (1961) Steps towards artificial intelligence. Proc IRE 49:8–30


Mitchell TM, Keller RM, Kedar-Cabelli ST (1986) Explanation based generalisation: a unifying view. Mach Learn 1:47–80

Mitchell TM, Utgoff PE, Banerji RB (1983) Learning by experimentation: acquiring and refining problem-solving heuristics. In: Michalski R, Carbonell J, Mitchell T (eds) Machine learning: an artificial intelligence approach. Tioga, Palo Alto

Moore AW (1990) Efficient memory-based learning for robot control. Ph.D. thesis, UCAM-CL-TR-209, Computer Laboratory, University of Cambridge, Cambridge

Niculescu-Mizil A, Caruana R (2007) Inductive transfer for Bayesian network structure learning. In: Proceedings of the 11th international conference on AI and statistics (AISTATS 2007), San Juan

Reid MD (2004) Improving rule evaluation using multitask learning. In: Proceedings of the 14th international conference on inductive logic programming, Porto, pp 252–269

Reid MD (2007) DEFT guessing: using inductive transfer to improve rule evaluation from limited data. Ph.D. thesis, School of Computer Science and Engineering, The University of New South Wales, Sydney

Rosenblatt F (1962) Principles of neurodynamics: perceptrons and the theory of brain mechanisms. Spartan Books, Washington, DC

Samuel A (1959) Some studies in machine learning using the game of checkers. IBM J Res Develop 3(3):210–229

Silver D, Bakir G, Bennett K, Caruana R, Pontil M, Russell S et al (2005) NIPS workshop on “inductive transfer: 10 years later”, Whistler

Sutton R (1984) Temporal credit assignment in reinforcement learning. Ph.D. thesis, Department of Computer and Information Science, University of Massachusetts, Amherst

Sutton R, Barto A (1998) Reinforcement learning: an introduction. MIT Press, Cambridge

Taylor ME, Stone P (2009) Transfer learning for reinforcement learning domains: a survey. J Mach Learn Res 10:1633–1685


Wang X, Simon HA, Lehman JF, Fisher DH (1996) Learning planning operators by observation and practice. In: Proceedings of the second international conference on AI planning systems (AIPS-94), Chicago, pp 335–340

Watkins C (1989) Learning with delayed rewards. Ph.D. thesis, Psychology Department, University of Cambridge, Cambridge

Watkins C, Dayan P (1992) Q-learning. Mach Learn 8(3–4):279–292


Author information

Authors and affiliations.

The University of New South Wales, Sydney, NSW, Australia

Claude Sammut


Editor information

Editors and affiliations.

Faculty of Information Technology, Monash University, Melbourne, VIC, Australia

Geoffrey I. Webb


Copyright information

© 2017 Springer Science+Business Media New York

About this entry

Cite this entry.

Sammut, C. (2017). Credit Assignment. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_185


DOI : https://doi.org/10.1007/978-1-4899-7687-1_185

Published : 14 April 2017

Publisher Name : Springer, Boston, MA

Print ISBN : 978-1-4899-7685-7

Online ISBN : 978-1-4899-7687-1

eBook Packages : Computer Science Reference Module Computer Science and Engineering

Tim Dettmers

Making deep learning accessible.


Credit Assignment in Deep Learning

2017-09-16 by Tim Dettmers

This morning I got an email about my blog post discussing the history of deep learning, which rattled me back to a time in my academic career that I would rather not think about. It was a low point which nearly ended my master’s studies at the University of Lugano, and it made me feel so bad about blogging that I took two long years to recover. So what happened?

When I started my master’s, I worked on blog posts for NVIDIA which featured introductions to deep learning. Part of this blog post series also discussed the history of deep learning. I discussed what I thought to be the historical milestones with the largest impact, but in doing so, I inadvertently assigned credit to researchers who I thought had a good impact on the field. I worked on this blog post and circulated it in my deep learning class’s forums, to the dismay of my then advisor, who held a view opposite to mine.

To evaluate the credit that a research idea deserves, I believe it matters not only who had the idea first but equally who actually made it work (the implementation). My ex-advisor believed that it only really matters who published the idea first.

My advisor scolded me in class for my views since he felt very strongly that the first idea counts and that my view is plain wrong. To redeem myself and to salvage the relationship with him, I felt coerced to change my blog post to his wishes.

This quasi-censorship of my blog post eviscerated me, and in consequence, I lost all desire to blog for two years. Despite my efforts, the relationship with my then advisor deteriorated further, and I had to look for a new advisor.

Looking back at the blog post that I produced, I feel ashamed. It does not express my personal views. I value integrity, and my behavior did not reflect who I want to be.

I write this blog post to discuss my true beliefs about credit assignment and why I believe that the idea, its communication and its implementation are all equally important.

Who Deserves Credit for Deep Learning Ideas?

There has been a lot of discussion about how to assign credit to researchers, or in other words, how to determine whose work had a large impact. Note that I do not discuss here who deserves credit for discovering an idea; I look at who deserves credit for the impact that an idea has. Looking at this, there are two main camps: the first believes that ideas and implementation count equally, and the second believes that what counts is who had the ideas first.

The problem with this discussion is that it is not a scientific topic, but a philosophical one. How do we determine what has how much value? We use the scientific method. What is the scientific method in philosophy? Use reductions to arrive at simple statements, then use logic to derive other factual statements, failing that — like in this case — we make thought experiments where we isolate variables which we then take to extremes. Let’s do this now to get insight into the issue.

All Ideas, No Communication, No Implementation

Let’s imagine there exists a person who has come up with all ideas in deep learning of the past and all ideas in deep learning of the future. However, this person cannot communicate with either words or writing. This person also cannot write code. How much credit does such a person deserve?

I would argue that such a person deserves zero credit. In fact, I think it is epistemologically correct that this person deserves no credit because nobody can know that he or she deserves credit.

All Ideas, 1 Communication + No Ideas, Full Communication

We have a Person 1 that invented everything in deep learning. Now this person can communicate, but he or she is so unclear that only a single Person 2 can understand these ideas.

Now, Person 2 has no creativity but is a perfect communicator. Person 2 basically just translates what Person 1 said and the entire world understands. Who deserves credit here?

It is tempting to think that Person 2 deserves all the credit because Person 1 is useless without Person 2. But similarly, Person 2 is useless without Person 1.

Both people thus deserve equal credit — no one can achieve anything without the other.

All Ideas, Full Communication, 1 Implementation

Let’s increase the complexity of the problem. Let us say the duo of Person 1 and Person 2 spread the ideas so that the entire world understands deep learning, but let us assume that all people are implementation agnostic. Nobody can make deep learning work. The world knows about all deep learning ideas but cannot solve any problem with them. In such a world, the ideas of deep learning are quickly abandoned by the large majority due to their uselessness (just like the majority of the population does not care much about pure mathematics, e.g., few care whether $a^n + b^n = c^n$ has positive integer solutions for any integer $n > 2$).

Enter Person 3. Person 3 has no creativity, cannot communicate, but he or she can implement all the deep learning ideas in a practical manner. The world looks at this person’s code and suddenly is able to solve all problems which are solvable with deep learning.

Who deserves the most credit: Person 1, Person 2, or Person 3?

As discussed before, Person 1 and Person 2 deserve equal credit, and also here, I would argue, that Person 3 deserves equal credit.

This becomes apparent when we think about the value of ideas. Ideas are useful when they have an effect. If they have no effect or only a small effect, they deserve no recognition or little recognition. If deep learning ideas had no practical value, they would not deserve more recognition than, say, the idea that there might be something beyond the observable universe: it is a nice idea, but it will never produce anything of much value.

Comparative Individual Value For Collective Contributions

The evaluation changes if we distribute the contributions of ideas, communication, and implementation among many individuals. If we take the three scenarios above, expand Persons 1 to 3 into groups of people, and evaluate them comparatively, that is, ask how much value each individual’s contribution has compared to the contributions of all the other people, we arrive at the following thought experiment.

1 Ideas, 1000 Communication, 1000 Implementation

Suppose we have 1 person who has all the ideas, 1000 people who can understand these ideas and communicate them to the world, and 1000 people who can implement them to yield practical value. How do we assign credit?

As discussed, it is reasonable that each of the areas, (1) ideas, (2) communication, and (3) implementation, deserves equal credit. If the groups of 1000 people made contributions (communications and implementations) of equal value, it would be fair to say that:

  • 1 Ideas: 1/3 credit
  • 1000 Communication: 1/3000 credit each
  • 1000 Implementation: 1/3000 credit each.

We see in this case the one person with the idea should receive the largest amount of credit.

Similarly, if we weight the numbers differently, and if we assume contributions of individuals in groups are equal, then this credit assignment holds for all other combinations like (1000, 1, 1000), or (10000, 1000, 1).
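The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the stated assumptions (equal credit per area, equal contributions within each group); the function name `split_credit` is made up for this example.

```python
from fractions import Fraction


def split_credit(group_sizes):
    """Give each area an equal share, then split it evenly within the group.

    group_sizes: dict mapping an area name to the number of people in it.
    Returns the credit that a single person in each area receives.
    """
    area_share = Fraction(1, len(group_sizes))
    return {area: area_share / size for area, size in group_sizes.items()}


# The (1, 1000, 1000) scenario from the text.
sizes = {"ideas": 1, "communication": 1000, "implementation": 1000}
shares = split_credit(sizes)
for area, share in shares.items():
    print(f"{area}: {share} credit per person")
```

Swapping in other group sizes, such as (1000, 1, 1000), only changes the per-person shares; the per-area share stays at one third.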

Timing and Relational Effects

In the real world, we have timing effects and relational effects. Not all 1000 Ideas, Communication, or Implementation people will publish their work at the same time, but they will have a specific sequence. In this sequence, they will influence and build on each other — they stand on the shoulders of giants. Who are the giants? Who deserves what amount of credit?

If we think about it, it is not much different from our first analysis. Let’s take a Person 1 who only has ideas and can communicate his or her ideas to only one other person, Person 2; Person 2, standing on Person 1’s shoulders, is only able to communicate the ideas to one other person, Person 3; Person 3, standing on Person 2’s shoulders, in turn, can communicate the ideas clearly to the entire world.

If we express the abilities of people as numbers which represent their fraction of all the value in ideas, communication, and implementation, we could weight Person 1, Person 2, and Person 3 in this way:

  • Person 1: [1, 1/10^10, 0]
  • Person 2: [0, 1/10^10, 0]
  • Person 3: [0, 1, 0]

This means that Person 1 has all the ideas (1) and could communicate these ideas to 1 person (we assume a total population of 10 billion people to make the math easier). Person 2 has no ideas, could understand Person 1’s idea but could only communicate this idea to one other person, Person 3. Person 3 has no ideas, understands the idea of Person 2 and can communicate it so that everybody understands. Note that this example is simplified so that all people are implementation agnostic.

From these fractions, we see that Person 2 has almost no fraction of contributions since Person 2 is not creative and also not a good communicator. However, if we look at the relational effects we know Person 3 would have no value without Person 2, and Person 1 would also have no value without Person 2. So how do we solve this credit assignment problem?

We can try to solve this problem by expressing it as a weighted graph that expresses relationships over time and the relationships of the fractions with respect to the world.


How do we weight the contribution of each person in this case? There are many answers to this, but here PageRank would be a good fit. PageRank works exactly as we discussed above: the credit is assigned comparatively, that is, if we have a (1, 1000, 1000) distribution, the largest chunk of PageRank will be assigned to the single person. Thus it reflects our evaluation system. PageRank also takes into account the relationships between nodes and their recursive weight (standing on the shoulders of giants).

Using the scenario above, we find the contributions as follows:

We see that P2 has the largest contribution despite being only the bridge between P1 and P3 who have the largest fractions (all the ideas and full communication abilities). However, P1’s success depends on P2, and P3’s success depends on P2 and as such P2 is the most critical link in the entire system.

This is quite insightful. If you understand some obscure research and communicate this to just a few researchers who, in turn, influence many other researchers then you will have made a substantial contribution to the deep learning community.

It would not feel this way because you will probably not experience any fame or recognition here. The recognition will come for P1 (having ideas) and P3 (communicating ideas). But still, the numbers do not lie here.

This experiment was quite interesting, and if you want to experiment a bit by yourself, you can  download the code to see what happens if you add more people and more relationships among these people. This exercise can give quite some insight into what is valuable for research.
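For readers who do not want to download anything, here is a minimal, self-contained sketch of weighted PageRank by power iteration. It is not the code linked above, and the edge list is an assumption made for illustration: an edge from X to Y means that X depends on Y, so credit flows from X to Y. The damping factor of 0.85 is the conventional default, not a value from this post.

```python
def pagerank(edges, damping=0.85, tol=1e-12, max_iter=1000):
    """Weighted PageRank by power iteration.

    edges: dict mapping (source, target) to a positive weight;
           rank flows from source to target.
    Returns a dict mapping each node to its score (scores sum to 1).
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Assumed dependency graph: P1 and P3 both depend on the bridge P2,
# and P2 depends on both of them (ideas from P1, reach through P3).
dependencies = {
    ("P1", "P2"): 1.0,
    ("P3", "P2"): 1.0,
    ("P2", "P1"): 1.0,
    ("P2", "P3"): 1.0,
}
ranks = pagerank(dependencies)
for person, score in sorted(ranks.items()):
    print(f"{person}: {score:.4f}")
```

With this assumed graph, P2 ends up with the largest score and P1 and P3 tie, because both of them pass credit to P2. Different edges or weights will give different numbers, so the scores should not be read as the results reported above.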

Response to Criticism on Reddit

There has been some sharp criticism on Reddit concerning ideas expressed in this blog post. The user metacurse makes the point that in science we usually credit those researchers who had the idea first and that communication and implementation are not valued. For example, we value Albert Einstein highly for the discovery of general relativity and the photoelectric effect, and not Neil deGrasse Tyson for communicating them; similarly, Cocks is credited for RSA even though he never implemented it in any way that was widely used (and he could not produce public implementations due to the classified status of RSA). However, this entire argument is rather weak and unfair:

  • I do not discuss who should be credited for an idea or the usage of the idea, I discuss who should be credited for the overall impact of an idea. These are very different questions.
  • He uses examples to try to prove his own hypothesis when we know that examples cannot prove anything (he uses classical philosophical techniques, which have some value but do not generate reliable knowledge the way analytical philosophy does). He mocks me for not using examples myself.
  • He appeals to the emotions of the readers by saying that my views endorse unethical ideas like “stealing old ideas and rebranding them as your own” when this has nothing to do with my argument (reductio ad Hitlerum). He does this quite successfully, swaying many emotional readers. I do not think this is helpful.

To make it clearer why metacurse’s argument is not relevant to mine, take this thought experiment.

We have a super genius who knows about all possible ideas and writes them down so that everybody can understand them easily. Then she locks these notes away in a locker and dies the next second. Over the next billions of years humanity rediscovers all ideas and uses them to build a flourishing society where all living things live in harmony and every being is fulfilled and so forth. One second before the last human dies in heat death, that human discovers the notebook.

Metacurse’s argument would look for the answer to the question: Should our super genius be credited for inventing everything? Metacurse would argue, yes, and I would totally agree.

What I discuss in this blog post: How much impact did our super genius have on the overall impact of all ideas? Very little: she never had any direct or even indirect effect with any of the ideas; the only impact she had was that one other person understood that she had the ideas before others had them. That is the total impact of her ideas. Her impact is almost zero.

Here I discussed how it is best to think about contributions in deep learning. From thought experiments, we could see that ideas, their communication, and their implementation are equally important contributions.

We also discussed how timing effects and dependencies could be modeled in a relational graph. We found that people who link ideas to communicators can make substantial contributions to the research community even if they themselves are not creative or good communicators. Creating the links between influential ideas and influential communicators (or people who implement) is important here.


Reader Interactions

Murray Frank says

2018-01-03 at 02:48

Giving credit is a long debated problem. Frequently someone comes up with an idea that has a huge influence. Then other people say that in reality someone else had really thought of the idea earlier. Often such claims are true. In other cases you can see the essence of the idea but not the whole thing in the earlier work. In some cases we retroactively give credit. In other cases it does not happen. For example, Kuhn and Tucker came up with a standard theorem in optimization in 1951. Eventually people realized that it was also in Karush’s 1939 master’s thesis. To this day you will see the theorem called the Kuhn-Tucker theorem, and you will also see it listed as the Karush-Kuhn-Tucker theorem. There are many such examples.

Tim Dettmers says

2018-01-15 at 22:49

There are many interesting examples indeed! Do you think this relates to how past researchers communicated their work, or to how “mature” their work is in general (master’s thesis vs. full researchers)?

2017-11-15 at 08:01

Hi, just found this blog, great stuff! Just a minor point – “Communication can be important even after publication. Just look at Immanuel Kant’s work, which is probably the most important philosophical work, yet it was not read for some time because nobody understood his ideas.” I find that very strange, not a good example at all. “Probably the most important philosophical work” – I don’t know what that’s based on. “Arguably”, arguably, but ‘probably’?! I’ve never heard anyone claim that. It’s news to me that Kant wasn’t read for some time. Whatever “some time” means. But I don’t think that’s right at all. And “nobody understood his ideas” is even more murky. (There’s not even a single thing you could point to and call “an understanding of his ideas”, i.e. there are a wide range of interpretations, even to this day. What one person calls an understanding, to another is gross misunderstanding, etc.) His Critiques have a repellent, almost impenetrable style, granted, maybe that’s what you meant. p.s. Gauss invented the FFT, apparently, though it seems he never told anyone, not sure how much credit he deserves. I kept expecting to see his name on these pages in that connection. 🙂

2017-11-16 at 22:49

I am talking about the “Critique of Pure Reason” here. Kant published it, and it was poorly received because people could not understand it. He rewrote it 6 years later, and suddenly people could actually understand his points, which in turn could help other people understand. Through this Kant became the most talked-about philosopher during that time.

Karthikeyan Chittayil says

2017-09-30 at 07:53

Tim, I think you have a nice way of putting complex concepts in simple words, and elementary maths. Please keep it up. As you have brought it out, communication indeed is very important. Keep blogging !

2017-10-01 at 15:25

Thank you, that means a lot to me!

Yun Teng says

2017-09-28 at 03:16

Enlightening as always! The saying “Those who can, do, those who can’t, teach” has always bothered me. Because of that, I really liked your “Timing and Relational Effects” example with the PageRank, which showed that Person 2 was the most important, and even Person 3 had 0.2305 contribution. To me, Person 2 is like a mentor/advisor and Person 3 is an instructor with many students, both roles having significant impact in the real world.

2017-09-29 at 14:52

Indeed, I think this is a good way to think about Person 2 and Person 3.

Alison B Lowndes says

2017-09-18 at 12:59

I will read this in full when I get chance but just wanted to add that if I’d listened to my Supervisor I’d have researched neural networks on CPU! I didn’t listen to him – which is lucky – because he also told me I’d never be able to recognise features in histology images!? It’s a tough world out there so you just have to learn to be humble and courageous at the same time. PS My Supervisor also told me to steer clear of your (ex) Supervisor! PPS We still want to hire you!

2017-09-25 at 15:12

Thanks for your comment, Alison. I really appreciate it! Indeed it can be messy with the wrong supervisor, but I must also say that it was a good experience for me since I learned a lot from that experience. With that, I will be able to make a better choice for my PhD advisor. So in the end, it was not so bad after all!

2017-09-18 at 01:26

Hmm in other fields a lot of credit is given to the original person who came up with it, even if it wasn’t used or popularized right away. Like in computer graphics you give credit to the mathematician who came up with quaternions even if (as far as I know) they weren’t used for years any where else. It was just some obscure math. Likewise the guy who came up with plate tectonics was considered a quack when it was introduced, yet years later when we accept it we give him credit (even if he couldn’t popularize it). I think in a sense the purpose of academia and universities is to go beyond what’s necessarily useful today, to explore the far off distance, even if it isn’t worth popularizing right now (because there’s no use for it).

My understanding of Deep Learning is a lot of it got popularized due to faster computing machines, in particular GPUs. Certainly I believe the person who implemented DL on GPUs deserves a lot of credit for it, but I wouldn’t dismiss people who came beforehand with ideas because they didn’t implement it right away. (Actually this is kind of inspiring me to take a look into who first decided to use quaternions in graphics to see interesting early things they may have done.)

I was thinking maybe you’re coming from more of a corporate standpoint, where all that matters to you is utilization. But even in the corporate world credit to obscure ideas is given. An example is Apple. Popularizing GUIs and what have you. But we still give credit to Xerox, and even in interviews Steve Jobs discusses this!

In your examples you give this idea about someone being unable to communicate their ideas to the world. That makes sense to me, if they couldn’t get it out and it remained so obscure that it only remained in their minds, they probably don’t deserve much credit (like you say there wouldn’t even be proof). But if someone gets a publication out, that is no longer obscure, and I would say that’s a worthy of credit assignment.

2017-09-18 at 11:34

You talk about who to credit for an idea. This blog post does not discuss this topic. This blog post discusses how the impact of the idea is distributed among people and thus how much credit people should receive.

Xerox, of course, should be credited with the idea of the GUI. It was their original research. But who gets how much credit for the impact the idea of a GUI had over time?

Communication can be important even after publication. Just look at Immanuel Kant’s work, which is probably the most important philosophical work, yet it was not read for some time because nobody understood his ideas. It was similar for the LSTM. People just could not understand the paper and thus the significance of LSTMs.

Note that all these are mere examples which do not yield any reliable knowledge. You can look at it with the scientific method from other disciplines too, and I think this would be a better way to contribute to this discussion.

For example, in social network analysis similar effects as I describe here are well known (central nodes in a network are strong even though their only merit is their network connectivity itself). You can see similar things in some games in game theory. This can be used to describe these effects mathematically and thus I believe these theories are better than using examples which have a hard time to prove an argument.

2017-09-18 at 14:22

Ah sorry, I think I misunderstood your blog post originally, thought you were dismissing original credit. Impact isn’t something I have thought about seriously, and I think the topic is something that could easily be brushed aside for the status quo with lazy statements like “impact isn’t something I have thought about seriously” or with hostility to change. So with that said I think it’s good you’re questioning credit assignment, even if you are met with a lot of hostility. So thank you.

I agree communication is important. I am very new to deep learning, and I find the initiatives within the field for improving communication to be extremely inspiring and helpful to me. Including your own work, especially your last blog post about research direction and computational efficiency. So thank you and I hope you continue to write.

Rein Halbersma says

2017-09-16 at 21:36

Nice post! You could also interpret the credit assignment problem as a bargaining game in which each player bargains over the deployment of its assets (ideas, communication, implementation) to create something of value. Applying the tools from cooperative game theory, I would expect a solution concept like the Shapley-value to emerge as a fair credit assignment. Linking pins such as communicators connecting different communities also have great value in such bargaining games.

2017-09-16 at 21:56

Thanks for your comment — this is a very interesting analogy! I think something like the Shapley-value and its problem fit this entire problem quite well and I would expect the solutions to be quite similar.
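To make the Shapley-value analogy in this thread concrete, here is a small Python sketch. It is an illustration only: the value function `impact` is a hypothetical one in which an idea has impact only when ideas, communication, and implementation all cooperate, which mirrors the thought experiments in the post. The helper name `shapley_values` is also made up for this example.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions.

    players: sequence of player names.
    value:   function mapping a frozenset of players to the coalition's worth.
    """
    orderings = list(permutations(players))
    totals = {player: Fraction(0) for player in players}
    for order in orderings:
        coalition = set()
        for player in order:
            before = value(frozenset(coalition))
            coalition.add(player)
            # Marginal contribution of this player when joining in this order.
            totals[player] += value(frozenset(coalition)) - before
    return {player: totals[player] / len(orderings) for player in players}


ROLES = ("ideas", "communication", "implementation")


def impact(coalition):
    # Hypothetical: impact exists only when all three roles work together.
    return Fraction(1) if coalition == frozenset(ROLES) else Fraction(0)


credit = shapley_values(ROLES, impact)
for role, share in credit.items():
    print(f"{role}: {share}")
```

Under this assumed value function each role receives one third of the credit, which matches the equal split argued for in the post. Other value functions would give different splits.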


Title: A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Abstract: The Credit Assignment Problem (CAP) refers to the longstanding challenge of Reinforcement Learning (RL) agents to associate actions with their long-term consequences. Solving the CAP is a crucial step towards the successful deployment of RL in the real world since most decision problems provide feedback that is noisy, delayed, and with little or no information about the causes. These conditions make it hard to distinguish serendipitous outcomes from those caused by informed decision-making. However, the mathematical nature of credit and the CAP remains poorly understood and defined. In this survey, we review the state of the art of Temporal Credit Assignment (CA) in deep RL. We propose a unifying formalism for credit that enables equitable comparisons of state-of-the-art algorithms and improves our understanding of the trade-offs between the various methods. We cast the CAP as the problem of learning the influence of an action over an outcome from a finite amount of experience. We discuss the challenges posed by delayed effects, transpositions, and a lack of action influence, and analyse how existing methods aim to address them. Finally, we survey the protocols to evaluate a credit assignment method and suggest ways to diagnose the sources of struggle for different methods. Overall, this survey provides an overview of the field for new-entry practitioners and researchers, it offers a coherent perspective for scholars looking to expedite the starting stages of a new study on the CAP, and it suggests potential directions for future research.
