Since the existing visual question answering model lacks long-term memory modules for answering complex questions, it is easy to cause the loss of effective information. In order to further improve ...