search_quotes() takes a search pattern (or regular expression) and returns all quotes that match the pattern in the selected fields. search_text() does the same, restricting the search to the quote text. get_quotes() is a convenient wrapper for search_quotes() that by default returns all quotes in the database.
Usage
search_quotes(
  search,
  ignore_case = TRUE,
  fuzzy = FALSE,
  fields = c("text", "source", "tags", "cite"),
  ...
)
search_text(search, fuzzy = FALSE, ...)
get_quotes(search = ".*", ...)
Arguments
- search
A character string or regular expression (regex) pattern used to search the quote database.
- ignore_case
If TRUE, matching is done without regard to case.
- fuzzy
If TRUE, use agrep to allow approximate matches to the search string.
- fields
A character vector giving the fields to search. The default is c("text", "source", "tags", "cite"). The shortcut fields = "all" searches all fields (including the citation and url).
- ...
Additional arguments passed to agrep to fine-tune fuzzy search parameters.
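The fields and ... arguments can be combined to narrow a search or to loosen fuzzy matching. The calls below are a minimal sketch (output not shown); the patterns and the max.distance value are illustrative, and max.distance is simply passed through ... to agrep().

# Search only the source field (case-insensitive by default)
search_quotes("Fisher", fields = "source")
# Search every field, including the citation and url
search_quotes("bayes", fields = "all")
# Loosen fuzzy matching by passing agrep()'s max.distance through ...
search_quotes("Turkey", fuzzy = TRUE, max.distance = 2)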
Value
For search_quotes() and search_text(): a data frame (also of class 'statquote') containing all quotes that match the search parameters.
For get_quotes(): a data frame (also of class 'statquote') containing all quotes in the database.
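Because the returned object is a data frame with an additional 'statquote' class, it prints as formatted quotes (as in the examples below) but can also be inspected with ordinary data-frame tools. A minimal sketch, assuming only the class and columns described on this page:

out <- search_quotes("Tukey")
class(out)    # should include both "statquote" and "data.frame"
nrow(out)     # number of matching quotes
names(out)    # columns such as qid, text, source, cite, url, tags, tex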
Examples
search_quotes("^D") # regex to find all quotes that start with "D"
#>
#> Did you ever see such a thing as a drawing of a muchness?
#> --- Lewis Carroll, Alice in Wonderland
#>
#> Do not put faith in what statistics say until you have carefully
#> considered what they do not say.
#> --- William W. Watt
#>
#> Data analysis is an aid to thinking and not a replacement for.
#> --- Richard Shillington
#>
#> Discovering the unexpected is more important than confirming the known.
#> --- George E. P. Box
#>
#> Direction is more important than speed. We are so busy looking at our
#> speedometers that we forget the milestone.
#> --- Anonymous
#>
#> DIAGRAMS are of great utility for illustrating certain questions of
#> vital statistics by conveying ideas on the subject through the eye,
#> which cannot be so readily grasped when contained in figures.
#> --- Florence Nightingale, Mortality of the British Army, 1857
#>
#> Doubt is not a pleasant mental state, but certainty is a ridiculous
#> one.
#> --- Voltaire (1694-1778)
#>
#> Doing applied statistics is never easy, especially if you want to get
#> it right.
#> --- Xiao-Li Meng, 2005 Joint Statistical Meetings
#>
#> Data analysis is a tricky business -- a trickier business than even
#> tricky data analysts sometimes think.
#> --- Bert Gunter, S-news mailing list, 6 Jun 2002
#>
#> Dodge (2003) provided a definition of 'outlier' that is helpful but far
#> from complete: In a sample of N observations, it is possible for a
#> limited number to be so far separated in value from the remainder that
#> they give rise to the question whether they are not from a different
#> population, or that the sampling technique is at fault. Such values are
#> called outliers.
#> --- David Finney, Calibration Guidelines Challenge Outlier Practices,
#> The American Statistician, 2006, Vol 60, No 4, p. 310.
#>
#> Despite the stranglehold that hypothesis testing has on experimental
#> psychology, I find it difficult to imagine a less insightful means of
#> transitting from data to conclusions.
#> --- G. R. Loftus, On the tyranny of hypothesis testing in the social
#> sciences. Contemporary Psychology 36: 102-105.
#>
#> Data analysis methods in psychology still emphasize statistical
#> significance testing, despite numerous articles demonstrating its
#> severe deficiencies. It is now possible to use meta-analysis to show
#> that reliance on significance testing retards the development of
#> cumulative knowledge. The reform of teaching and practice will also
#> require that researchers learn that the benefits that they believe flow
#> from use of significance testing are illusory. Teachers must re-vamp
#> their courses to bring students to understand that a) reliance on
#> significance testing retards the growth of cumulative research
#> knowledge; b) benefits widely believed to flow from significance
#> testing do not in fact exist; c) significance testing methods must be
#> replaced with point estimates and confidence intervals in individual
#> studies and with meta-analyses and the integration of multiple studies.
#> This reform is essential to the future progress of cumulative knowledge
#> and psychological research.
#> --- Frank L. Schmidt, Statistical significance testing and cumulative
#> knowledge in psychology: implications for training of researchers.
#> Psychological Methods 1(2), Jun 1996, 115-129.
#>
#> Despite widespread misconceptions to the contrary, the rejection of a
#> given null hypothesis gives us no basis for estimating the probability
#> that a replication of the research will again result in rejecting that
#> null hypothesis.
#> --- Jacob Cohen, Things I have learned (so far), 1990. American
#> Psychologist 45: 1304-1312.
#>
#> Data visualization is part art and part science. The challenge is to
#> get the art right without getting the science wrong and vice versa.
#> --- Claus O. Wilke, Fundamentals of Data Visualization
#>
#> Data in isolation are meaningless, a collection of numbers. Only in
#> context of a theory do they assume significance...
#> --- George Greenstein, "Frozen Star", 1983
#>
#> Data are not just numbers, they are numbers with a context. ... In data
#> analysis, context provides meaning.
#> --- George W. Cobb & David S. Moore, Mathematics, Statistics, and
#> Teaching. American Mathematical Monthly, 801-23.
search_quotes("Tukey") # all quotes with "Tukey"
#>
#> The greatest value of a picture is when it forces us to notice what we
#> never expected to see.
#> --- John W. Tukey, Exploratory Data Analysis, 1977
#>
#> If data analysis is to be well done, much of it must be a matter of
#> judgment, and 'theory' whether statistical or non-statistical, will
#> have to guide, not command.
#> --- John W. Tukey, The Future of Data Analysis, Annals of Mathematical
#> Statistics, Vol. 33 (1), 1962.
#>
#> The physical sciences are used to 'praying over' their data, examining
#> the same data from a variety of points of view. This process has been
#> very rewarding, and has led to many extremely valuable insights.
#> Without this sort of flexibility, progress in physical science would
#> have been much slower. Flexibility in analysis is often to be had
#> honestly at the price of a willingness not to demand that what has
#> already been observed shall establish, or prove, what analysis
#> suggests. In physical science generally, the results of praying over
#> the data are thought of as something to be put to further test in
#> another experiment, as indications rather than conclusions.
#> --- John W. Tukey, The Future of Data Analysis, Annals of Mathematical
#> Statistics, Vol. 33 (1), 1962.
#>
#> If one technique of data analysis were to be exalted above all others
#> for its ability to be revealing to the mind in connection with each of
#> many different models, there is little doubt which one would be chosen.
#> The simple graph has brought more information to the data analyst's
#> mind than any other device. It specializes in providing indications of
#> unexpected phenomena.
#> --- John W. Tukey, The Future of Data Analysis, The Annals of
#> Mathematical Statistics, Vol. 33, No. 1 (Mar., 1962), pp. 1-67.
#>
#> The greatest possibilities of visual display lie in vividness and
#> inescapability of the intended message. A visual display can stop your
#> mental flow in its tracks and make you think. A visual display can
#> force you to notice what you never expected to see.
#> --- John W. Tukey
#>
#> The purpose of [data] display is comparison (recognition of phenomena),
#> not numbers ... The phenomena are the main actors, numbers are the
#> supporting cast.
#> --- John W. Tukey
#>
#> ...But it is not always clear *which* 1000 words.
#> --- John W. Tukey, 1973
#>
#> Exploratory data analysis can never be the whole story, but nothing
#> else can serve as the foundation stone -- as the first step.
#> --- John W. Tukey, Exploratory Data Analysis, 1977, p.3.
#>
#> The best thing about being a statistician is that you get to play in
#> everyone's backyard.
#> --- John W. Tukey
#>
#> Far better an approximate answer to the right question, which is often
#> vague, than an exact answer to the wrong question, which can always be
#> made precise.
#> --- John W. Tukey, The Future of Data Analysis, The Annals of
#> Mathematical Statistics, Vol. 33, No. 1 (Mar., 1962), pp. 1-67.
#>
#> The worst, i.e., most dangerous, feature of 'accepting the null
#> hypothesis' is the giving up of explicit uncertainty ... Mathematics
#> can sometimes be put in such black-and-white terms, but our knowledge
#> or belief about the external world never can.
#> --- John W. Tukey, The Philosophy of Multiple Comparisons, Statist.
#> Sci. 6 (1) 100 - 116, February, 1991.
#>
#> Better to have an approximate answer to the right question than a
#> precise answer to the wrong question.
#> --- John W. Tukey, Quoted by John Chambers
#>
#> The combination of some data and an aching desire for an answer does
#> not ensure that a reasonable answer can be extracted from a given body
#> of data.
#> --- John W. Tukey, Sunset Salvo, The American Statistician Vol. 40 (1),
#> 1986.
#>
#> The practical power of a statistical test is the product of its'
#> statistical power and the probability of use.
#> --- John W. Tukey, A Quick, Compact, Two Sample Test to Duckworth's
#> Specifications
#>
#> Since the aim of exploratory data analysis is to learn what seems to
#> be, it should be no surprise that pictures play a vital role in doing
#> it well.
#> --- John W. Tukey, John W. Tukey's Works on Interactive Graphics. The
#> Annals of Statistics Vol. 30, No. 6 (Dec., 2002), pp. 1629-1639
#>
#> There is nothing better than a picture for making you think of
#> questions you had forgotten to ask (even mentally).
#> --- John W. Tukey & Paul Tukey, John W. Tukey's Works on Interactive
#> Graphics. The Annals of Statistics Vol. 30, No. 6 (Dec., 2002), pp.
#> 1629-1639
#>
#> Unless exploratory data analysis uncovers indications, usually
#> quantitative ones, there is likely to nothing for confirmatory data
#> analysis to consider.
#> --- John Tukey, Exploratory Data Analysis, p. 3.
#>
#> One thing the data analyst has to learn is how to expose himself to
#> what his data are willing--or even anxious--to tell him. Finding clues
#> requires looking in the right places and with the right magnifying
#> glass.
#> --- John Tukey, Exploratory Data Analysis, p. 21.
#>
#> In data analysis, a plot of y against x may help us when we know
#> nothing about the logical connection from x to y--even when we do not
#> know whether or not there is one--even when we know that such a
#> connection is impossible.
#> --- John Tukey, Exploratory Data Analysis, p. 131.
#>
#> Whatever the data, we can try to gain understanding by straightening or
#> by flattening. When we succeed in doing one or both, we almost always
#> see more clearly what is going on.
#> --- John Tukey, Exploratory Data Analysis, p. 148.
#>
#> A competent data analysis of an even moderately complex set of data is
#> a thing of trials and retreats, of dead ends and branches.
#> --- John Tukey, Computer Science and Statistics: Proceedings of the
#> 14th Symposium on the Interface, p. 4.
#>
#> The purpose of plotting is to convey phenomena to the viewer's cortex,
#> not to provide a place to lookup observed numbers.
#> --- Kaye Basford, John Tukey, Graphical Analysis of Multi-Response
#> Data, p. 373.
#>
#> Had we started with this [quantile] plot, noticed that it looks
#> straight and not looked further, we would have missed the important
#> features of the data. The general lesson is important. Theoretical
#> quantile-quantile plots are not a panacea and must be used in
#> conjunction with other displays and analyses to get a full picture of
#> the behavior of the data.
#> --- John M. Chambers, William S. Cleveland, Beat Kleiner, Paul A.
#> Tukey, Graphical Methods for Data Analysis, p. 212.
#>
#> Our conclusion about [choropleth] patch maps agrees with Tukey's
#> (1979), who left little doubt about his opinions by stating, 'I am
#> coming to be less and less satisfied with the set of maps that some
#> dignify by the name *statistical map* and that I would gladly revile
#> with the name *patch map*'.
#> --- William Cleveland & Robert McGill, Graphical Perception: Theory,
#> Experimentation, and Application to the Development of Graphical
#> Models, Journal of the American Statistical Association, 79, 531--554,
#> 1984.
#>
#> There is no more reason to expect one graph to 'tell all' than to
#> expect one number to do the same.
#> --- John Tukey, Exploratory Data Analysis.
#>
#> There is no excuse for failing to plot and look.
#> --- John Tukey, Exploratory Data Analysis
#>
#> Spatial patterns may be due to many sources of variation. In the
#> context of seeking explanations, John Tukey said that, "the unadjusted
#> plot should not be made." In other words, our perceptual/cognitive
#> abilities are poor in terms of adjusting for known source of variations
#> and envisioning the resulting map. A better strategy is to control for
#> known sources of variation and/or adjust the estimates before making
#> the map.
#> --- Dan Carr, Survey Research Methods Section newsletter, July 2002.
#>
#> One is so much less than two. [John Tukey's eulogy of his wife.]
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, The Annals of Statistics, 2001, Vol 30, p. 46.
#>
#> Statisticians classically asked the wrong question--and were willing to
#> answer with a lie, one that was often a downright lie. They asked "Are
#> the effects of A and B different?" and they were willing to answer
#> "no". All we know about the world teaches us that the effects of A and
#> B are always different--in some decimal place--for every A and B. Thus
#> asking "Are the effects different?" is foolish. What we should be
#> answering first is "Can we tell the direction in which the effects of A
#> differ from the effects of B?" In other words, can we be confident
#> about the direction from A to B? Is it "up", "down" or "uncertain"?
#> --- John Tukey, The Philosophy of Multiple Comparisons, Statistical
#> Science, 6, 100-116.
#>
#> No one has ever shown that he or she had a free lunch. Here, of course,
#> "free lunch" means "usefulness of a model that is locally easy to make
#> inferences from".
#> --- John Tukey, Issues relevant to an honest account of data-based
#> inference, partially in the light of Laurie Davies' paper.
#>
#> If asymptotics are of any real value, it must be because they teach us
#> something useful in finite samples. I wish I knew how to be sure when
#> this happens.
#> --- John Tukey, Issues relevant to an honest account of data-based
#> inference, partially in the light of Laurie Davies' paper.
#>
#> George Box: We don't need robust methods. A good statistician
#> (particularly a Bayesian one) will model the data well and find the
#> outliers. John Tukey: They ran over 2000 statistical analyses at
#> Rothamsted last week and nobody noticed anything. A red light warning
#> would be most helpful.
#> --- George Box vs. John Tukey, Douglas Martin, 1999 S-Plus Conference
#> Proceedings.
#>
#> Statistics is a science in my opinion, and it is no more a branch of
#> mathematics than are physics, chemistry, and economics; for if its
#> methods fail the test of experience--not the test of logic--they will
#> be discarded.
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, by David Brillinger, The Annals of Statistics, 2001, Vol 30.
#>
#> One Christmas Tukey gave his students books of crossword puzzles as
#> presents. Upon examining the books the students found that Tukey had
#> removed the puzzle answers and had replaced them with words of the
#> sense: "Doing statistics is like doing crosswords except that one
#> cannot know for sure whether one has found the solution."
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, by David Brillinger, The Annals of Statistics, 2001, Vol 30, p.
#> 22.
#>
#> A sort of question that is inevitable is: "Someone taught my students
#> exploratory, and now (boo hoo) they want me to tell them how to assess
#> significance or confidence for all these unusual functions of the data.
#> Oh, what can we do?" To this there is an easy answer: TEACH them the
#> JACKKNIFE.
#> --- John Tukey, We Need Both Exploratory and Confirmatory, The American
#> Statistician, Vol 34, No 1, p. 25.
#>
#> John Tukey's eye for detail was amazing. When we were preparing some of
#> the material for our book (which was published last year), it was most
#> disconcerting to have him glance at the data and question one value out
#> of several thousand points. Of course, he was correct and I had missed
#> identifying this anomaly.
#> --- Kaye Basford
#>
#> Many students are curious about the '1.5 x IQR Rule';, i.e. why do we
#> use Q1 - 1.5 x IQR (or Q3 + 1.5 x IQR) as the value for deciding if a
#> data value is classified as an outlier? Paul Velleman, a statistician
#> at Cornell University, was a student of John Tukey, who invented the
#> boxplot and the 1.5 x IQR Rule. When he asked Tukey, 'Why 1.5?', Tukey
#> answered, 'Because 1 is too small and 2 is too large.' [Assuming a
#> Gaussian distribution, about 1 value in 100 would be an outlier. Using
#> 2 x IQR would lead to 1 value in 1000 being an outlier.]
#> --- Unknown
#>
#> It is a rare thing that a specific body of data tells us as clearly as
#> we would wish how it itself should be analyzed.
#> --- John Tukey, Exploratory Data Analysis, p. 397.
#>
#> Just which robust/resistant methods you use is not important--what is
#> important is that you use some. It is perfectly proper to use both
#> classical and robust/resistant methods routinely, and only worry when
#> they differ enough to matter. But, when they differ, you should think
#> hard.
#> --- John Tukey, Quoted by Doug Martin
#>
#> I believe that there are many classes of problems where Bayesian
#> analyses are reasonable, mainly classes with which I have little
#> acquaintance.
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, The Annals of Statistics, 2001, Vol 30, p. 45.
#>
#> The twin assumptions of normality of distribution and homogeneity of
#> variance are not ever exactly fulfilled in practice, and often they do
#> not even hold to a good approximation.
#> --- John W. Tukey, The problem of multiple comparisons. 1973.
#> Unpublished manuscript, Dept. of Statistics, Princeton University.
#>
#> [A]sking 'Are the effects different?' is foolish.
#> --- John W. Tukey, The philosophy of multiple comparisons. 1991.
#> Statistical Science 6 : 100-116.
#>
#> Empirical knowledge is always fuzzy! And theoretical knowledge, like
#> all the laws of physics, as of today's date, is always wrong-in detail,
#> though possibly providing some very good approximations indeed.
#> --- John W. Tukey, The philosophy of multiple comparisons. 1991.
#> Statistical Science 6 : 100-116.
#>
#> If significance tests are required for still larger samples, graphical
#> accuracy is insufficient, and arithmetical methods are advised. A word
#> to the wise is in order here, however. Almost never does it make sense
#> to use exact binomial significance tests on such data - for the
#> inevitable small deviations from the mathematical model of independence
#> and constant split have piled up to such an extent that the binomial
#> variability is deeply buried and unnoticeable. Graphical treatment of
#> such large samples may still be worthwhile because it brings the
#> results more vividly to the eye.
#> --- Frederick Mosteller & John W Tukey, The Uses and Usefulness of
#> Binomial Probability Paper, Journal of the American Statistical
#> Association 44, 1949.
#>
#> If we need a short suggestion of what exploratory data analysis is, I
#> would suggest that: 1. it is an attitude, AND 2. a flexibility, AND 3.
#> some graph paper (or transparencies, or both).
#> --- John W. Tukey, Jones, L. V. (Ed.). (1986). The collected works of
#> John W. Tukey: Philosophy and principles of data analysis 1949-1964
#> (Vols. III & IV). London: Chapman & Hall.
#>
#> Three of the main strategies of data analysis are: 1. graphical
#> presentation. 2. provision of flexibility in viewpoint and in
#> facilities, 3. intensive search for parsimony and simplicity.
#> --- John W. Tukey, Jones, L. V. (Ed.). (1986). The collected works of
#> John W. Tukey: Philosophy and principles of data analysis 1949-1964
#> (Vols. III & IV). London: Chapman & Hall.
search_quotes("Turkey", fuzzy = TRUE) # fuzzy match
#>
#> The greatest value of a picture is when it forces us to notice what we
#> never expected to see.
#> --- John W. Tukey, Exploratory Data Analysis, 1977
#>
#> If data analysis is to be well done, much of it must be a matter of
#> judgment, and 'theory' whether statistical or non-statistical, will
#> have to guide, not command.
#> --- John W. Tukey, The Future of Data Analysis, Annals of Mathematical
#> Statistics, Vol. 33 (1), 1962.
#>
#> The physical sciences are used to 'praying over' their data, examining
#> the same data from a variety of points of view. This process has been
#> very rewarding, and has led to many extremely valuable insights.
#> Without this sort of flexibility, progress in physical science would
#> have been much slower. Flexibility in analysis is often to be had
#> honestly at the price of a willingness not to demand that what has
#> already been observed shall establish, or prove, what analysis
#> suggests. In physical science generally, the results of praying over
#> the data are thought of as something to be put to further test in
#> another experiment, as indications rather than conclusions.
#> --- John W. Tukey, The Future of Data Analysis, Annals of Mathematical
#> Statistics, Vol. 33 (1), 1962.
#>
#> If one technique of data analysis were to be exalted above all others
#> for its ability to be revealing to the mind in connection with each of
#> many different models, there is little doubt which one would be chosen.
#> The simple graph has brought more information to the data analyst's
#> mind than any other device. It specializes in providing indications of
#> unexpected phenomena.
#> --- John W. Tukey, The Future of Data Analysis, The Annals of
#> Mathematical Statistics, Vol. 33, No. 1 (Mar., 1962), pp. 1-67.
#>
#> The greatest possibilities of visual display lie in vividness and
#> inescapability of the intended message. A visual display can stop your
#> mental flow in its tracks and make you think. A visual display can
#> force you to notice what you never expected to see.
#> --- John W. Tukey
#>
#> The purpose of [data] display is comparison (recognition of phenomena),
#> not numbers ... The phenomena are the main actors, numbers are the
#> supporting cast.
#> --- John W. Tukey
#>
#> ...But it is not always clear *which* 1000 words.
#> --- John W. Tukey, 1973
#>
#> Exploratory data analysis can never be the whole story, but nothing
#> else can serve as the foundation stone -- as the first step.
#> --- John W. Tukey, Exploratory Data Analysis, 1977, p.3.
#>
#> The best thing about being a statistician is that you get to play in
#> everyone's backyard.
#> --- John W. Tukey
#>
#> Far better an approximate answer to the right question, which is often
#> vague, than an exact answer to the wrong question, which can always be
#> made precise.
#> --- John W. Tukey, The Future of Data Analysis, The Annals of
#> Mathematical Statistics, Vol. 33, No. 1 (Mar., 1962), pp. 1-67.
#>
#> The worst, i.e., most dangerous, feature of 'accepting the null
#> hypothesis' is the giving up of explicit uncertainty ... Mathematics
#> can sometimes be put in such black-and-white terms, but our knowledge
#> or belief about the external world never can.
#> --- John W. Tukey, The Philosophy of Multiple Comparisons, Statist.
#> Sci. 6 (1) 100 - 116, February, 1991.
#>
#> Better to have an approximate answer to the right question than a
#> precise answer to the wrong question.
#> --- John W. Tukey, Quoted by John Chambers
#>
#> The combination of some data and an aching desire for an answer does
#> not ensure that a reasonable answer can be extracted from a given body
#> of data.
#> --- John W. Tukey, Sunset Salvo, The American Statistician Vol. 40 (1),
#> 1986.
#>
#> The practical power of a statistical test is the product of its'
#> statistical power and the probability of use.
#> --- John W. Tukey, A Quick, Compact, Two Sample Test to Duckworth's
#> Specifications
#>
#> Since the aim of exploratory data analysis is to learn what seems to
#> be, it should be no surprise that pictures play a vital role in doing
#> it well.
#> --- John W. Tukey, John W. Tukey's Works on Interactive Graphics. The
#> Annals of Statistics Vol. 30, No. 6 (Dec., 2002), pp. 1629-1639
#>
#> There is nothing better than a picture for making you think of
#> questions you had forgotten to ask (even mentally).
#> --- John W. Tukey & Paul Tukey, John W. Tukey's Works on Interactive
#> Graphics. The Annals of Statistics Vol. 30, No. 6 (Dec., 2002), pp.
#> 1629-1639
#>
#> Unless exploratory data analysis uncovers indications, usually
#> quantitative ones, there is likely to nothing for confirmatory data
#> analysis to consider.
#> --- John Tukey, Exploratory Data Analysis, p. 3.
#>
#> One thing the data analyst has to learn is how to expose himself to
#> what his data are willing--or even anxious--to tell him. Finding clues
#> requires looking in the right places and with the right magnifying
#> glass.
#> --- John Tukey, Exploratory Data Analysis, p. 21.
#>
#> In data analysis, a plot of y against x may help us when we know
#> nothing about the logical connection from x to y--even when we do not
#> know whether or not there is one--even when we know that such a
#> connection is impossible.
#> --- John Tukey, Exploratory Data Analysis, p. 131.
#>
#> Whatever the data, we can try to gain understanding by straightening or
#> by flattening. When we succeed in doing one or both, we almost always
#> see more clearly what is going on.
#> --- John Tukey, Exploratory Data Analysis, p. 148.
#>
#> A competent data analysis of an even moderately complex set of data is
#> a thing of trials and retreats, of dead ends and branches.
#> --- John Tukey, Computer Science and Statistics: Proceedings of the
#> 14th Symposium on the Interface, p. 4.
#>
#> The purpose of plotting is to convey phenomena to the viewer's cortex,
#> not to provide a place to lookup observed numbers.
#> --- Kaye Basford, John Tukey, Graphical Analysis of Multi-Response
#> Data, p. 373.
#>
#> Had we started with this [quantile] plot, noticed that it looks
#> straight and not looked further, we would have missed the important
#> features of the data. The general lesson is important. Theoretical
#> quantile-quantile plots are not a panacea and must be used in
#> conjunction with other displays and analyses to get a full picture of
#> the behavior of the data.
#> --- John M. Chambers, William S. Cleveland, Beat Kleiner, Paul A.
#> Tukey, Graphical Methods for Data Analysis, p. 212.
#>
#> Our conclusion about [choropleth] patch maps agrees with Tukey's
#> (1979), who left little doubt about his opinions by stating, 'I am
#> coming to be less and less satisfied with the set of maps that some
#> dignify by the name *statistical map* and that I would gladly revile
#> with the name *patch map*'.
#> --- William Cleveland & Robert McGill, Graphical Perception: Theory,
#> Experimentation, and Application to the Development of Graphical
#> Models, Journal of the American Statistical Association, 79, 531--554,
#> 1984.
#>
#> There is no more reason to expect one graph to 'tell all' than to
#> expect one number to do the same.
#> --- John Tukey, Exploratory Data Analysis.
#>
#> There is no excuse for failing to plot and look.
#> --- John Tukey, Exploratory Data Analysis
#>
#> Spatial patterns may be due to many sources of variation. In the
#> context of seeking explanations, John Tukey said that, "the unadjusted
#> plot should not be made." In other words, our perceptual/cognitive
#> abilities are poor in terms of adjusting for known source of variations
#> and envisioning the resulting map. A better strategy is to control for
#> known sources of variation and/or adjust the estimates before making
#> the map.
#> --- Dan Carr, Survey Research Methods Section newsletter, July 2002.
#>
#> One is so much less than two. [John Tukey's eulogy of his wife.]
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, The Annals of Statistics, 2001, Vol 30, p. 46.
#>
#> Statisticians classically asked the wrong question--and were willing to
#> answer with a lie, one that was often a downright lie. They asked "Are
#> the effects of A and B different?" and they were willing to answer
#> "no". All we know about the world teaches us that the effects of A and
#> B are always different--in some decimal place--for every A and B. Thus
#> asking "Are the effects different?" is foolish. What we should be
#> answering first is "Can we tell the direction in which the effects of A
#> differ from the effects of B?" In other words, can we be confident
#> about the direction from A to B? Is it "up", "down" or "uncertain"?
#> --- John Tukey, The Philosophy of Multiple Comparisons, Statistical
#> Science, 6, 100-116.
#>
#> No one has ever shown that he or she had a free lunch. Here, of course,
#> "free lunch" means "usefulness of a model that is locally easy to make
#> inferences from".
#> --- John Tukey, Issues relevant to an honest account of data-based
#> inference, partially in the light of Laurie Davies' paper.
#>
#> If asymptotics are of any real value, it must be because they teach us
#> something useful in finite samples. I wish I knew how to be sure when
#> this happens.
#> --- John Tukey, Issues relevant to an honest account of data-based
#> inference, partially in the light of Laurie Davies' paper.
#>
#> George Box: We don't need robust methods. A good statistician
#> (particularly a Bayesian one) will model the data well and find the
#> outliers. John Tukey: They ran over 2000 statistical analyses at
#> Rothamsted last week and nobody noticed anything. A red light warning
#> would be most helpful.
#> --- George Box vs. John Tukey, Douglas Martin, 1999 S-Plus Conference
#> Proceedings.
#>
#> Statistics is a science in my opinion, and it is no more a branch of
#> mathematics than are physics, chemistry, and economics; for if its
#> methods fail the test of experience--not the test of logic--they will
#> be discarded.
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, by David Brillinger, The Annals of Statistics, 2001, Vol 30.
#>
#> One Christmas Tukey gave his students books of crossword puzzles as
#> presents. Upon examining the books the students found that Tukey had
#> removed the puzzle answers and had replaced them with words of the
#> sense: "Doing statistics is like doing crosswords except that one
#> cannot know for sure whether one has found the solution."
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, by David Brillinger, The Annals of Statistics, 2001, Vol 30, p.
#> 22.
#>
#> A sort of question that is inevitable is: "Someone taught my students
#> exploratory, and now (boo hoo) they want me to tell them how to assess
#> significance or confidence for all these unusual functions of the data.
#> Oh, what can we do?" To this there is an easy answer: TEACH them the
#> JACKKNIFE.
#> --- John Tukey, We Need Both Exploratory and Confirmatory, The American
#> Statistician, Vol 34, No 1, p. 25.
#>
#> John Tukey's eye for detail was amazing. When we were preparing some of
#> the material for our book (which was published last year), it was most
#> disconcerting to have him glance at the data and question one value out
#> of several thousand points. Of course, he was correct and I had missed
#> identifying this anomaly.
#> --- Kaye Basford
#>
#> Many students are curious about the '1.5 x IQR Rule';, i.e. why do we
#> use Q1 - 1.5 x IQR (or Q3 + 1.5 x IQR) as the value for deciding if a
#> data value is classified as an outlier? Paul Velleman, a statistician
#> at Cornell University, was a student of John Tukey, who invented the
#> boxplot and the 1.5 x IQR Rule. When he asked Tukey, 'Why 1.5?', Tukey
#> answered, 'Because 1 is too small and 2 is too large.' [Assuming a
#> Gaussian distribution, about 1 value in 100 would be an outlier. Using
#> 2 x IQR would lead to 1 value in 1000 being an outlier.]
#> --- Unknown
#>
#> It is a rare thing that a specific body of data tells us as clearly as
#> we would wish how it itself should be analyzed.
#> --- John Tukey, Exploratory Data Analysis, p. 397.
#>
#> Just which robust/resistant methods you use is not important--what is
#> important is that you use some. It is perfectly proper to use both
#> classical and robust/resistant methods routinely, and only worry when
#> they differ enough to matter. But, when they differ, you should think
#> hard.
#> --- John Tukey, Quoted by Doug Martin
#>
#> I believe that there are many classes of problems where Bayesian
#> analyses are reasonable, mainly classes with which I have little
#> acquaintance.
#> --- John Tukey, The life and professional contributions of John W.
#> Tukey, The Annals of Statistics, 2001, Vol 30, p. 45.
#>
#> The twin assumptions of normality of distribution and homogeneity of
#> variance are not ever exactly fulfilled in practice, and often they do
#> not even hold to a good approximation.
#> --- John W. Tukey, The problem of multiple comparisons. 1973.
#> Unpublished manuscript, Dept. of Statistics, Princeton University.
#>
#> [A]sking 'Are the effects different?' is foolish.
#> --- John W. Tukey, The philosophy of multiple comparisons. 1991.
#> Statistical Science 6 : 100-116.
#>
#> Empirical knowledge is always fuzzy! And theoretical knowledge, like
#> all the laws of physics, as of today's date, is always wrong-in detail,
#> though possibly providing some very good approximations indeed.
#> --- John W. Tukey, The philosophy of multiple comparisons. 1991.
#> Statistical Science 6 : 100-116.
#>
#> If significance tests are required for still larger samples, graphical
#> accuracy is insufficient, and arithmetical methods are advised. A word
#> to the wise is in order here, however. Almost never does it make sense
#> to use exact binomial significance tests on such data - for the
#> inevitable small deviations from the mathematical model of independence
#> and constant split have piled up to such an extent that the binomial
#> variability is deeply buried and unnoticeable. Graphical treatment of
#> such large samples may still be worthwhile because it brings the
#> results more vividly to the eye.
#> --- Frederick Mosteller & John W Tukey, The Uses and Usefulness of
#> Binomial Probability Paper, Journal of the American Statistical
#> Association 44, 1949.
#>
#> If we need a short suggestion of what exploratory data analysis is, I
#> would suggest that: 1. it is an attitude, AND 2. a flexibility, AND 3.
#> some graph paper (or transparencies, or both).
#> --- John W. Tukey, Jones, L. V. (Ed.). (1986). The collected works of
#> John W. Tukey: Philosophy and principles of data analysis 1949-1964
#> (Vols. III & IV). London: Chapman & Hall.
#>
#> Three of the main strategies of data analysis are: 1. graphical
#> presentation. 2. provision of flexibility in viewpoint and in
#> facilities, 3. intensive search for parsimony and simplicity.
#> --- John W. Tukey, Jones, L. V. (Ed.). (1986). The collected works of
#> John W. Tukey: Philosophy and principles of data analysis 1949-1964
#> (Vols. III & IV). London: Chapman & Hall.
# convert the search result to a data.frame
out <- search_quotes("bad data", fuzzy = TRUE)
as.data.frame(out)
#> qid
#> 255 255
#> text
#> 255 Bad data makes bad models. Bad models instruct people to make ineffective or harmful interventions. Those bad interventions produce more bad data, which is fed into more bad models.
#> source cite
#> 255 Cory Doctorow Machine Learning's Crumbling Foundations, Aug 2021.
#> url
#> 255 https://onezero.medium.com/machine-learnings-crumbling-foundations-bd11efa22b0
#> tags tex
#> 255 data <NA>
search_text("omnibus")
#>
#> Work by Bickel, Ritov, and Stoker (2001) shows that goodness-of-fit
#> tests have very little power unless the direction of the alternative is
#> precisely specified. The implication is that omnibus goodness-of-fit
#> tests, which test in many directions simultaneously, have little power,
#> and will not reject until the lack of fit is extreme.
#> --- Leo Breiman, Statistical Modeling: The Two Cultures, Statistical
#> Science, Vol 16, p. 203.
qdb <- get_quotes()
nrow(qdb)
#> [1] 711
names(qdb)
#> [1] "qid" "text" "source" "cite" "url" "tags" "tex"