Something that surprised me in the conversation Prof. Goodman had with Susan Murphy and Brendan Meade was the discussion of how humanity’s access to vast quantities of data and unparalleled computing power has changed the way “science” is done. It was fascinating to hear Dr. Meade describe how earth scientists studying earthquakes try to remain “humble,” in the sense that they don’t claim to know all of the exact theoretical physics a priori. In past discussions of simulation in our course, we’ve emphasized the importance of the models on which these simulations are based, and how these models both directly reflect and parametrize the complex processes occurring in their settings. The idea that current science and future simulations could instead be built on recurring patterns found in data, before theoretical models and equations are established, is an interesting one.
If I could add a question to the conversation between the three scientists, I would ask what they see as the possible shortcomings of this new “data science” approach to prediction. The group spoke briefly about how some of the “features” identified by data science algorithms are not entirely interpretable or communicable on a human scale, in human language. Is it possible that phenomena like this compound into more and more abstract representations of the world or setting at hand? How do they manage the risk of relying on algorithms that may miss the forest for the trees? And is it the case that, as long as the outcome is “accurate,” we trust these models and predictions even if we do not know their exact machinery?
I really enjoyed reading your analysis of the conversation with Dr. Murphy and Dr. Meade, Tara! I find it really interesting that this video discusses the impact of big data collection and stronger computing power on making more accurate predictions. The video I watched, with Dr. De Vivo and Dr. Kraft, noted that having more information does not necessarily lead to better prediction models, especially when it comes to human health: so many factors vary from person to person that the model would need to change each time. Likewise, biases and missing information in datasets could lead to skewed models, so I am wondering whether the evaluated accuracy of a given model depends on the kind of data we use for these systems. The video I watched also discussed the importance of communicating the uncertainty of these models to the public, to qualify whether a certain prediction tool should be relied upon in a given medical scenario. Therefore, I think it’s very interesting that Dr. Murphy and Dr. Meade’s point, that “new data science” essentially gets rid of that base understanding of how these models work, leads to a further question: how do we communicate our findings to the public if we do not know how such conclusions are truly being reached?