Topic: Computer science (Page 2)

You are looking at all articles with the topic "Computer science". We found 120 matches.


🔗 Levenshtein Distance

🔗 Computer science

In information theory, linguistics and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.

Levenshtein distance may also be referred to as edit distance, although that term may also denote a larger family of distance metrics known collectively as edit distance. It is closely related to pairwise string alignments.
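The minimum-edits definition above leads directly to a dynamic-programming recurrence. As an illustrative sketch (function name and row-by-row memory layout are choices of this example, not part of the article):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # prev[j] holds the distance between a[:i-1] and b[:j] (previous row)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` is 3 (two substitutions and one insertion).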


🔗 CRDT: Conflict-free replicated data type

🔗 Computer science

In distributed computing, a conflict-free replicated data type (CRDT) is a data structure which can be replicated across multiple computers in a network, where the replicas can be updated independently and concurrently without coordination between the replicas, and where it is always mathematically possible to resolve inconsistencies which might result.

The CRDT concept was formally defined in 2011 by Marc Shapiro, Nuno Preguiça, Carlos Baquero and Marek Zawirski. Development was initially motivated by collaborative text editing and mobile computing. CRDTs have also been used in online chat systems, online gambling, and in the SoundCloud audio distribution platform. The NoSQL distributed databases Redis, Riak and Cosmos DB have CRDT data types.
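The "updated independently and merged without coordination" property can be made concrete with one of the simplest state-based CRDTs, a grow-only counter (G-Counter). This is a minimal sketch, not any particular library's API; class and method names are illustrative:

```python
class GCounter:
    """State-based grow-only counter CRDT: one slot per replica."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts: dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        # Each replica only ever increments its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)
```

Two replicas can increment concurrently and then merge in either order; both converge on the same total.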


🔗 Jaccard Index

🔗 Computer science 🔗 Statistics

The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for gauging the similarity and diversity of sample sets. It was developed by Grove Karl Gilbert in 1884 as his ratio of verification (v), and is now frequently referred to as the Critical Success Index in meteorology. It was later developed independently by Paul Jaccard, who originally gave it the French name coefficient de communauté, and formulated again independently by T. Tanimoto; thus the names Tanimoto index and Tanimoto coefficient are also used in some fields. All of these variants, in general, take the ratio of intersection over union. The Jaccard coefficient measures similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets:

J(A, B) = |A ∩ B| / |A ∪ B| = |A ∩ B| / (|A| + |B| − |A ∩ B|).

Note that by design, 0 ≤ J(A, B) ≤ 1. If A ∩ B is empty, then J(A, B) = 0. The Jaccard coefficient is widely used in computer science, ecology, genomics, and other sciences where binary or binarized data are used. Both exact and approximate methods are available for hypothesis testing with the Jaccard coefficient.

Jaccard similarity also applies to bags (i.e., multisets). The formula is similar, but the symbols denote bag intersection and bag sum rather than union; the maximum value is 1/2.

J(A, B) = |A ∩ B| / |A ⊎ B| = |A ∩ B| / (|A| + |B|).

The Jaccard distance, which measures dissimilarity between sample sets, is complementary to the Jaccard coefficient and is obtained by subtracting the Jaccard coefficient from 1, or, equivalently, by dividing the difference of the sizes of the union and the intersection of two sets by the size of the union:

d_J(A, B) = 1 − J(A, B) = (|A ∪ B| − |A ∩ B|) / |A ∪ B|.

An alternative interpretation of the Jaccard distance is as the ratio of the size of the symmetric difference A △ B = (A ∪ B) − (A ∩ B) to the size of the union. Jaccard distance is commonly used to calculate an n × n matrix for clustering and multidimensional scaling of n sample sets.

This distance is a metric on the collection of all finite sets.
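The set and bag definitions above translate almost directly into code. A short sketch (the convention that two empty sets have similarity 1 is a choice of this example; the article's formula leaves that case undefined):

```python
from collections import Counter

def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; defined here as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def jaccard_distance(a: set, b: set) -> float:
    return 1.0 - jaccard(a, b)

def jaccard_bags(a, b) -> float:
    """Bag (multiset) variant: intersection takes per-element minima,
    and the denominator is the bag sum |A| + |B|; maximum value is 1/2."""
    ca, cb = Counter(a), Counter(b)
    inter = sum((ca & cb).values())  # Counter & is element-wise min
    total = sum(ca.values()) + sum(cb.values())
    return inter / total if total else 0.0
```

For example, `jaccard({1, 2, 3}, {2, 3, 4})` is 0.5, and even identical bags score only 1/2 under the bag variant.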

There is also a version of the Jaccard distance for measures, including probability measures. If μ is a measure on a measurable space X, then we define the Jaccard coefficient by

J_μ(A, B) = μ(A ∩ B) / μ(A ∪ B),

and the Jaccard distance by

d_μ(A, B) = 1 − J_μ(A, B) = μ(A △ B) / μ(A ∪ B).

Care must be taken if μ(A ∪ B) = 0 or μ(A ∪ B) = ∞, since these formulas are not well defined in those cases.

The MinHash min-wise independent permutations locality sensitive hashing scheme may be used to efficiently compute an accurate estimate of the Jaccard similarity coefficient of pairs of sets, where each set is represented by a constant-sized signature derived from the minimum values of a hash function.
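The MinHash idea rests on the fact that, for a random hash function, the probability that two sets share the same minimum hash value equals their Jaccard similarity. A toy sketch (the salted-hash construction and function names here are illustrative, not the scheme's canonical form):

```python
import hashlib
import random

def minhash_signature(items: set, num_hashes: int = 256, seed: int = 0) -> list:
    """Constant-size signature: for each of num_hashes salted hash
    functions, keep the minimum hash value over the set's elements."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_hashes)]

    def h(salt: int, x) -> int:
        digest = hashlib.blake2b(f"{salt}:{x}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    return [min(h(s, x) for x in items) for s in salts]

def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    # The fraction of positions where the minima agree estimates J(A, B).
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

With two overlapping ranges whose true similarity is 1/3, a 256-hash signature typically lands within a few percentage points of the exact value.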


🔗 Stochastic Parrot

🔗 Computer science 🔗 Philosophy 🔗 Philosophy/Contemporary philosophy 🔗 Philosophy/Philosophy of mind 🔗 Artificial Intelligence

In machine learning, "stochastic parrot" is a term coined by Emily M. Bender in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell. The term refers to "large language models that are impressive in their ability to generate realistic-sounding language but ultimately do not truly understand the meaning of the language they are processing."


🔗 One Instruction Set Computer

🔗 Computing 🔗 Computer science

A one-instruction set computer (OISC), sometimes called an ultimate reduced instruction set computer (URISC), is an abstract machine that uses only one instruction – obviating the need for a machine-language opcode. With a judicious choice for the single instruction and given infinite resources, an OISC is capable of being a universal computer in the same manner as traditional computers that have multiple instructions. OISCs have been recommended as aids in teaching computer architecture and have been used as computational models in structural computing research.
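One common choice for the single instruction is subleq ("subtract and branch if less than or equal to zero"). A toy interpreter, sketched under the usual conventions that memory holds signed integers and a negative branch target halts the machine:

```python
def run_subleq(mem: list, pc: int = 0, max_steps: int = 10_000) -> list:
    """subleq a b c: mem[b] -= mem[a]; jump to c if the result <= 0,
    otherwise fall through to the next instruction triple."""
    steps = 0
    while pc >= 0 and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem
```

For example, `run_subleq([3, 3, -1, 42])` clears cell 3 by subtracting it from itself, then halts; and the three-instruction program `[9, 11, 3, 11, 10, 6, 11, 11, -1, x, y, 0]` adds cell 9 into cell 10 via a scratch cell, which is how richer operations are built from the single instruction.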


🔗 Soundex – a phonetic algorithm for indexing names by sound

🔗 Computer science 🔗 Linguistics 🔗 Linguistics/Applied Linguistics

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling. The algorithm mainly encodes consonants; a vowel is not encoded unless it is the first letter. Soundex is the most widely known of all phonetic algorithms, in part because it is a standard feature of popular database software such as DB2, PostgreSQL, MySQL, SQLite, Ingres, MS SQL Server, and Oracle. Improvements to Soundex are the basis for many modern phonetic algorithms.
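A sketch of the classic American Soundex rules (first letter kept, consonants mapped to digits, adjacent duplicates collapsed, H and W transparent, padded to four characters):

```python
def soundex(name: str) -> str:
    """American Soundex code: the first letter plus three digits."""
    codes = {}
    for group, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")]:
        for ch in group:
            codes[ch] = digit
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result = [name[0]]
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        if ch in "HW":          # H and W don't separate duplicate codes
            continue
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result.append(digit)
        prev = digit            # vowels (empty code) reset the duplicate check
    return "".join(result)[:4].ljust(4, "0")
```

For example, "Robert" and "Rupert" both encode to R163, which is exactly the homophone-matching behavior described above.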


🔗 Duff's Device

🔗 Computing 🔗 Computer science 🔗 C/C++ 🔗 C/C++/C

In the C programming language, Duff's device is a way of manually implementing loop unrolling by interleaving two syntactic constructs of C: the do-while loop and a switch statement. Its discovery is credited to Tom Duff in November 1983, when Duff was working for Lucasfilm and used it to speed up a real-time animation program.

Loop unrolling attempts to reduce the overhead of conditional branching needed to check whether a loop is done, by executing a batch of loop bodies per iteration. To handle cases where the number of iterations is not divisible by the unrolled-loop increments, a common technique among assembly language programmers is to jump directly into the middle of the unrolled loop body to handle the remainder. Duff implemented this technique in C by using C's case label fall-through feature to jump into the unrolled body.
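Duff's device itself depends on C's case-label fall-through and cannot be reproduced in languages without it. The general unroll-plus-remainder idea it implements can still be sketched; here the remainder is handled by an explicit prologue rather than by jumping into the middle of the unrolled body (names and the factor of 4 are illustrative):

```python
def fill(dst: list, value, count: int) -> None:
    """Write `value` into the first `count` slots of `dst`,
    unrolled four writes per loop-condition check."""
    i = 0
    for _ in range(count % 4):   # remainder prologue (what Duff's device
        dst[i] = value           # handles by jumping into the loop body)
        i += 1
    for _ in range(count // 4):  # unrolled body: 4 writes per branch check
        dst[i] = value
        dst[i + 1] = value
        dst[i + 2] = value
        dst[i + 3] = value
        i += 4
```

The payoff is one loop-termination test per four writes instead of one per write; the fall-through trick merely folds the prologue into the unrolled body itself.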


🔗 Negative Base

🔗 Computer science 🔗 Mathematics

A negative base (or negative radix) may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the system's base; but that base is negative: the base b is equal to −r for some natural number r (r ≥ 2).

Negative-base systems can accommodate all the same numbers as standard place-value systems, but both positive and negative numbers are represented without the use of a minus sign (or, in computer representation, a sign bit); this advantage is countered by an increased complexity of arithmetic operations. The need to store the information normally contained by a negative sign often results in a negative-base number being one digit longer than its positive-base equivalent.

The common names for negative-base positional numeral systems are formed by prefixing nega- to the name of the corresponding positive-base system; for example, negadecimal (base −10) corresponds to decimal (base 10), negabinary (base −2) to binary (base 2), negaternary (base −3) to ternary (base 3), and negaquaternary (base −4) to quaternary (base 4).
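Conversion to a negative base works like ordinary repeated division, with one twist: remainders must be forced into the range 0 to r−1. A negabinary sketch (function names are illustrative):

```python
def to_negabinary(n: int) -> str:
    """Digits of n in base -2; negatives need no minus sign."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:            # force the remainder into {0, 1}
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))

def from_negabinary(s: str) -> int:
    """Evaluate a digit string as powers of -2."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(s)))
```

For example, 6 encodes as 11010 (16 − 8 − 2) and −3 as 1101 (−8 + 4 + 1), both without a sign bit.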


🔗 Dennis Ritchie

🔗 Biography 🔗 Computing 🔗 Computer science 🔗 Biography/science and academia 🔗 New York (state) 🔗 New York (state)/Hudson Valley 🔗 Computing/Computer science 🔗 Software 🔗 Software/Computing 🔗 C/C++ 🔗 Japan 🔗 New Jersey 🔗 Linux

Dennis MacAlistair Ritchie (September 9, 1941 – c. October 12, 2011) was an American computer scientist. He created the C programming language and, with long-time colleague Ken Thompson, the Unix operating system and the B programming language. Ritchie and Thompson received the Turing Award from the ACM in 1983, the Hamming Medal from the IEEE in 1990, and the National Medal of Technology from President Bill Clinton in 1999. Ritchie was head of the Lucent Technologies System Software Research Department when he retired in 2007. He was the "R" in K&R C and was commonly known by his username, dmr.


🔗 Reversible computing

🔗 Technology 🔗 Computing 🔗 Computer science

Reversible computing is a model of computing where the computational process to some extent is time-reversible. In a model of computation that uses deterministic transitions from one state of the abstract machine to another, a necessary condition for reversibility is that the relation of the mapping from (nonzero-probability) states to their successors must be one-to-one. Reversible computing is a form of unconventional computing.
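The one-to-one condition can be illustrated with a toy state-transition example: XOR-ing one register into another is a bijection on pairs of states (it is its own inverse), while overwriting a register merges many predecessor states into one and so cannot be undone. A hedged sketch, with illustrative names:

```python
def cnot(x: int, y: int) -> tuple:
    """Reversible update in the style of a controlled-not gate:
    (x, y) -> (x, x ^ y). Applying it twice restores the input,
    so the mapping from states to successors is one-to-one."""
    return x, x ^ y

def overwrite(x: int, y: int) -> tuple:
    """Irreversible update: (x, y) -> (x, x). Many inputs share
    one output, so the predecessor state is lost."""
    return x, x
```

Checking injectivity over a small state space makes the contrast concrete: `cnot` permutes the states, `overwrite` collapses them.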
