# Intelligent Systems - Summary Notes

[TOC]

## 1. Intelligence vs Autonomy

What is an "Intelligent System"?

### What is Intelligence?

No commonly agreed definition; it is a moving target.

Ideas for defining it:

- Memory
- Decision-making
- Creativity
- Awareness
- Common sense
- Self-reflection
- Ability to learn
- Handling new situations

Categories of definitions:

- social
- emotional
- senso-motoric: e.g. a great tennis player can do cool tricks with a tennis ball
- mental

=> human intelligence as the starting point

#### What is Turing's definition of intelligence?

If we cannot **distinguish** what a **computer** does from what a **human** does, then the computer is intelligent.

=> we are still using this definition in the tech fields, e.g. AlphaGo, Turing++

weak AI (mimics intelligence) >< strong AI (actually thinks)

### What are the different AI perspectives?

#### Knowledge-based AI

Intelligence = being able to remember many things and process them to produce knowledge, plus manipulating symbols (understanding a language).

#### Behavior-based AI

Intelligence emerges from elementary behaviors, e.g. navigating a maze, grasping an object. **Senso-motoric** intelligence. Physical grounding hypothesis (a similar grounding idea).

#### Connectionism (neural networks)

Intelligence is in the **connections** between small units (neurons) -> ANN

#### Distributed AI

**Interaction** = intelligence. Interaction at the signal level is connectionism >< interaction at the knowledge level (**higher** level) = Distributed AI.

(+) **robustness**: agents can take over each other's tasks

## 2. Expert Systems

How to construct an Intelligent System?

### What are they?

_DEF:_ a computer system that emulates the **decision-making** ability of a human expert. E.g. building a molecular structure from a spectral graph.

We build it by extracting knowledge from experts in order to implement it in a system = Knowledge Engineering, closely tied to Expert Systems.

Popular because proven to work and provable (mathematically) >< machine learning -> trustworthy.

#### Comparison with other systems

Expert Systems = the ancestor of Intelligent Systems. Intelligent systems are more autonomous and interact more with the world.

**Decision Support System**: its goal is to provide **info** and **tools** to help the user make a decision >< an E.S. suggests a decision to the user by reasoning over the data.

### How do they work?

#### Components

- a non-expert **user**
- an **inference engine**: handles the facts and the queries from the users. Fixed part.
- **rules**: how the environment works, as if-then statements. Gathered from experts.
- **facts** specific to a given query: data about the specific case.

By **law**, E.S. are obliged to provide an argument for their recommendation/decision, so the interfacing must be clear.

---

#### Probabilistic reasoning

##### Bayes rule for classification

**Bayesian inference**: the main reasoning technique applied to probabilistic E.S. (vs non-probabilistic ones).

**Marginalizing** = the probability of one event, over a set of values, to happen.
**Maximizing** = finding the most likely event (the most probable value).

-> Bayes rule for classification uses the latter: given the attribute values, what is the most probable value of the target/class variable?

**Naive Bayes**: assume all events (attributes) are independent given the class.
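Below is a minimal sketch (the toy data and attribute names are hypothetical, not from the course) of Bayes-rule classification under the naive independence assumption: pick the class that maximizes P(class) times the product of P(attribute value | class).

```python
from collections import Counter, defaultdict

# Toy training data (hypothetical): (attribute dict, class label) pairs.
data = [
    ({"outlook": "sunny", "windy": "no"},  "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rain",  "windy": "no"},  "play"),
    ({"outlook": "rain",  "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"},  "play"),
]

# Count class priors and per-class attribute-value frequencies.
class_counts = Counter(label for _, label in data)
value_counts = defaultdict(Counter)            # (class, attribute) -> Counter of values
for attrs, label in data:
    for attr, val in attrs.items():
        value_counts[(label, attr)][val] += 1

def posterior(label, attrs):
    """Unnormalised P(class) * prod P(attr=val | class), with add-one smoothing."""
    p = class_counts[label] / len(data)
    for attr, val in attrs.items():
        counts = value_counts[(label, attr)]
        p *= (counts[val] + 1) / (sum(counts.values()) + len(counts) + 1)
    return p

def classify(attrs):
    # Maximising: return the most probable class given the attribute values.
    return max(class_counts, key=lambda label: posterior(label, attrs))

print(classify({"outlook": "sunny", "windy": "yes"}))   # -> "stay" on this toy data
```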
---

Not making the independence assumption can be very costly -> midway solutions for modelling joint probabilities:

* **Factor graph**: some events are dependent (those that are obviously dependent), some are not. The $\theta$'s are potential functions: $\theta(A,B)$ is the independent contribution of one **factor** ("clique"), with $A, B$ random variables.
* **Markov network** (undirected): we have **cliques** of events which are dependent. $Z$ = normalisation constant, because the product of the potentials does not sum to 1 by itself.
* **Bayesian network** (directed): the arrows define what kind of **causality** we have. See the Reasoning Techniques course.

---

Inference: there are elements which are

- evidence (observed)
- non-observed
- elements about which we want to make a prediction

Variable elimination: eliminate variables, using the evidence.

SotA (state of the art): adding probability to logic.

---

ProbLog: probabilistic clauses of the form `p :: q(t_1,...,t_n) :- body`, where the body restricts the logical variables and `p` is a probability value in $[0,1]$. Easier, because we don't have to deal with the full joint probabilities.

### ES Advantages (vs human)

- **consistency** (always the same decisions)
- **memory**
- **logic** (no sentimental/biased decisions)
- **reproducible** (does not get tired)

### ES Disadvantages

- no **common sense** -> you need to program it, which can be hard
- lack of **creativity**
- **maintenance** of the knowledge is manual
- not **adaptable** to new purposes, new conditions

## 3. Agent Oriented Design

### 3.1 The Agent Concept

_DEF:_ an entity **situated** in some environment that is capable of **flexible**, **autonomous** activity.

A minimally intelligent agent is:

- **pro-active** (takes initiative)
- **reactive** (perceives and reacts to the environment)
- has **social ability** (interacts with humans, capable of cooperating, negotiating)

### Objects vs Agents

Both have:

- **identity** (id)
- **state** (what)
- **passive** behavior (an action is done if invoked)

Agents additionally have:

- **active** behavior: more complex, threaded
- it must be defined: when, why, with whom, whether at all

### Expert Systems vs Agents

An E.S.:

- is not **embodied**
- does not interact with the **environment**
- is more **passive** (only answers questions)

### Environment types

* (in)accessible -> how much info is accessible
* (non-)deterministic -> how certain we are of an action's effect
* (non-)episodic -> repeated cycles, resetting the environment
* dynamic / static -> whether the agent is the only cause of change, or the environment moves on its own
* continuous / discrete -> fixed number of actions or not

### Architecture types

(These architectures can be combined.)

#### Logic-based agents

(Symbolic) knowledge-based AI view // intelligence through knowledge (e.g. planning agents).

Components:

- **theory of agency**: how the agent behaves given a context (fixed)
- **belief database**: info on the environment
- **actions**: chosen by the theory of agency based on the belief database

(-) how to **translate** perceptions into symbolic facts
(-) reasoning takes **time**
(-) a theory of agency for a **complex** environment is hard to write
(+) translation is becoming efficient (deep learning for vision)
(+) hardware is becoming fast (reasoning takes less time)
(+) logical policies can be learned
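A minimal sketch of a logic-based agent (the predicates, rules and actions are made-up illustrations): percepts arrive already translated into symbolic facts in a belief database, and a fixed theory of agency (if-then rules over the beliefs) picks the action.

```python
# Belief database: a set of symbolic facts, updated from (already translated) percepts.
beliefs = set()

def update_beliefs(percepts):
    """Add symbolic facts to the belief database (translation from raw sensors is assumed done upstream)."""
    beliefs.update(percepts)

# Theory of agency: fixed if-then rules mapping belief patterns to actions (hypothetical rules).
RULES = [
    (lambda b: "dirt(current_cell)" in b,                  "vacuum"),
    (lambda b: "obstacle(ahead)" in b,                     "turn_left"),
    (lambda b: "at(dock)" in b and "battery(low)" in b,    "charge"),
    (lambda b: True,                                       "move_forward"),   # default rule
]

def select_action(b):
    """Return the action of the first rule whose condition holds in the belief database."""
    for condition, action in RULES:
        if condition(b):
            return action

update_beliefs({"obstacle(ahead)", "battery(ok)"})
print(select_action(beliefs))   # -> "turn_left"
```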
#### Reactive agents

Behavioral, situated // intelligence from the interaction with the environment: situation (state) -> action.

Create one by:

1) defining a set of **behaviors** accomplishing some tasks
2) putting them in a subsumption **hierarchy**

(+) simple + fast
(+) robust against failure
(+) elegant
(+) tractable
(-) only local info is used (**stateless**)
(-) no notion of the past/**future + space**
(-) hard to **predict** whether the system will work / be intelligent
(-) the dynamics of **interaction** can be hard to model in a subsumption hierarchy

#### Belief/desire/intention (BDI) agents

**Deliberation**: which goals do we (currently) want to achieve?
**Means-ends reasoning**: what to do to achieve them?
**Intentions**: model and decide **short-term goals**, used to focus attention (on a small goal rather than the entire goal).

The agent must keep asking: are the intentions still useful? should they be reconsidered? -> **bold** (don't reconsider) vs **cautious** (constantly reconsider), a choice influenced by how fast the world changes.

Components of the control loop:

1) get **beliefs** (about the environment) and update them from the previous ones
2) based on them, generate **all possible action options** (= **desires**)
3) **filter** -> get the current intentions
4) compute the **action to return**, based on the intentions

#### Layered architectures

**Horizontal**: input -> layer n -> action n, plus a filter/hierarchy to choose which one to take (might be complex).
**Vertical**: one pass // neural net, or two passes (go back and forth through the layers). Each layer = a level of abstraction.

## 4. Genetic Algorithms

How to generate an Intelligent System?

### Evolutionary Algorithms

**What?** Optimize a problem using a population driven by the dynamics of genetic evolution.

Characteristics:

- any-time (you can stop the process at any point and the current population gives a usable solution)
- iterative
- stochastic

**How?** (A minimal GA sketch follows below.)

* **Initial population**: needs to be diverse and large (to have enough different genes). **Genotype** = the genetic encoding of an individual. **Phenotype** = the candidate solution expressed by a genotype; it is tested for its performance.
* **Selection**: fitness evaluation. If it is expensive -> you are limited in the number of individuals and generations; one solution is to pick individuals representative of each type.
  GA fitness function requirement: it must be able to evaluate partial solutions. A pure win/lose fitness doesn't work -> solve it with fitness = % of wins.
  Types:
  - roulette wheel: $p = f / \sum f$, + tuning by using a different distribution
  - tournaments, + tuning (add a probability of winning)
* **Reproduce** -> new population
  - crossover: single point / double point / uniform
  - mutation: bit flipping / real-value mutation
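A minimal sketch of the GA loop just described: roulette-wheel selection, single-point crossover and bit-flip mutation (the one-max fitness function and the parameter values are assumptions for illustration).

```python
import random

GENOME_LEN, POP_SIZE, P_CROSS, P_MUT = 20, 30, 0.7, 0.01

def fitness(genome):
    """Hypothetical 'one-max' fitness: count the 1-bits (it can evaluate partial solutions)."""
    return sum(genome)

def roulette_wheel(population):
    """Select one parent with probability p = f / sum(f)."""
    total = sum(fitness(g) for g in population)
    pick, acc = random.uniform(0, total), 0.0
    for genome in population:
        acc += fitness(genome)
        if acc >= pick:
            return genome
    return population[-1]

def crossover(a, b):
    """Single-point crossover with probability P_CROSS, otherwise copy one parent."""
    if random.random() < P_CROSS:
        cut = random.randint(1, GENOME_LEN - 1)
        return a[:cut] + b[cut:]
    return a[:]

def mutate(genome):
    """Bit-flip mutation: each bit flips with probability P_MUT."""
    return [1 - bit if random.random() < P_MUT else bit for bit in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
for generation in range(50):
    population = [mutate(crossover(roulette_wheel(population), roulette_wheel(population)))
                  for _ in range(POP_SIZE)]
print(max(fitness(g) for g in population))   # the best individual should be close to GENOME_LEN
```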
---

### Schema theory: why do evolutionary algorithms work?

**Schema**: a template matching a subset of all genotypes.
Notation: a binary digit = fixed position, `*` = wildcard -> e.g. `*1*1`.

**Defining length** $d(s)$ = the longest distance between 2 fixed positions.
**Order** $o(s)$ = the number of fixed positions.

$(1-p_m)^{o(s)}$, with $p_m$ the mutation probability of a single bit => the probability for a schema to survive mutation.

$1 - p_c \frac{d(s)}{n-1}$, with $p_c$ the crossover probability and $n-1$ the number of places where a genotype of length $n$ can be cut => the probability to survive a crossover.

**Schema Theorem**: schemata with

- above-average fitness (-> higher probability to survive selection)
- short defining length (-> less likely to be destroyed by crossover)
- low order (-> less likely to be destroyed by mutation)

-> are more likely to survive.

**Building block hypothesis**: GAs create blocks (in the sense of the schema theorem) of high fitness and recombine them into better solutions.
⚠ assumes the problem is divisible into sub-problems, which is not always possible for complex problems.

-> Genotype design (guidelines):

- put related genes **close** to each other (so that together they form a building block)
- make each gene as **independent** as possible
  + cooperative coevolution: independent (groups of) genes have an independent influence on the global fitness (-> make a GA for each group)

-> Crossover design:

- keep the probability of **destroying** schemata low (cutting inside a building block) // avoid destroying (good) schemata

**Elitism**: copy the best solutions unchanged, and reproduce the rest of the population (mutation + crossover).

If all genes are independent, we can look independently at each gene's influence on the global fitness to obtain an optimised phenotype. Do this by keeping a distribution per gene -> **SNES** (Separable Natural Evolution Strategy):

- separable -> each gene can be treated independently
- natural -> a natural (Gaussian) distribution for each gene

---

#### SNES

No more mutation, reproduction or selection: just update the means and standard deviations according to the **weighted sum** of the genomes in the population, **scaled** by **utility** // a gradient search per gene (gene // dimension). (A minimal sketch follows below.)

**Premature convergence** -> we can get stuck in a niche (sub-optimal) solution if the genotypes are not diverse enough; mutation is the only way out of it. -> Fix it by making multiple runs, or by tuning for slower convergence, OR by adding a **niche penalty** (when the radius of a group of similar individuals becomes too big, add a fitness penalty to shrink it).
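A minimal, simplified sketch of an SNES-style update (the utility weights, learning rates and objective are assumptions, and details differ from the published algorithm): sample genomes from one Gaussian per gene, rank them, and move each mean and standard deviation by a utility-weighted sum of the samples.

```python
import numpy as np

def snes(fitness, dim, popsize=20, generations=200, eta_mu=1.0, eta_sigma=0.1, seed=0):
    """Simplified separable NES: one Gaussian (mean, std) per gene, updated from
    utility-weighted samples instead of mutation/crossover/selection."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(generations):
        noise = rng.standard_normal((popsize, dim))          # s_k ~ N(0, 1), one column per gene
        genomes = mu + sigma * noise                          # phenotypes to evaluate
        scores = np.array([fitness(g) for g in genomes])
        ranks = scores.argsort().argsort()                    # 0 = worst, popsize-1 = best
        utility = ranks / (popsize - 1) - 0.5                 # assumed simple centred utility
        mu += eta_mu * sigma * (utility @ noise) / popsize    # utility-weighted sum of samples
        sigma *= np.exp(eta_sigma * (utility @ (noise**2 - 1)) / popsize)
    return mu

# Hypothetical objective: maximise -(x - 3)^2 summed over the genes (optimum at x_i = 3).
best = snes(lambda g: -np.sum((g - 3.0) ** 2), dim=5)
print(np.round(best, 2))   # each gene should end up near 3
```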
## 5. Multi Agent Coordination

How to construct an Intelligent System?

**Coordination**: managing the **dependencies** between **activities**.

---

### Cooperative systems

In a **cooperative system**, different agents have the same **goal**.

The characteristics of a cooperative system (how easy/tricky it is to cooperate) are evaluated using:

- the environment (predictable? diverse?)
- the cooperation (communication?)
- the cooperating entities (number? homogeneity?)

---

### Coordination theory

- **goals** identification
- **activities** -> to achieve the goals
- **actors** -> who does which tasks?
- **interdependencies**:
  - prerequisite: an activity must be done before another
  - shared resource: multiple activities need a single tool
  - simultaneity: 2+ activities must happen at the same time

Coordination is a must. **Centralization** is **bad**, because a single unit

- has limited processing power (it cannot handle exponentially complex problems)
- has limited memory/storage
- is more complex to design (than a smaller subunit)

### Coordination types

---

* **Client - Server**
  - the client **asks** questions, gives tasks to do
  - the server **handles** the client's requests (an entity can be both client and server)
  (+) simple synchronisation
  (-) servers can be killed by too many requests -> multiple servers? not really, because that requires synchronisation (increases complexity)
* **Task and result sharing** -> split a task to solve it more efficiently
  Why? each sub-problem can be solved with less knowledge.
  (+) each sub-problem requires less knowledge, is less complex
  (-) each task division requires design and synchronisation effort
* **Blackboard**
  Every agent shares a common **shared knowledge space** with the same **communication format**, but the agents act independently (they don't have to know about each other, as long as they respect the format).
  **Robots** have trouble here because it is difficult to transfer a program from one robot to another (each robot has its own specifics) -> use blackboards with different **channels** for different **topics**; each robot works in its relevant channels.
* **Contract Net Protocol** = a negotiation protocol (see the sketch after this list)
  1. the manager announces the tasks
  2. contractors that can do a task bid for it
  3. the manager selects the optimal contractor
  (+) flexible, distributed, dynamic
  (-) design issues: when should an agent ask for task sharing and when should it do the task alone? how does the manager select the agents? -> think of protocols
  client-server (every role is fixed) < C.N.P. < blackboard (everything is free)
* **Functionally Accurate Cooperation (FA/C)**
  **Functionally accurate**: each agent communicates only 100%-sure info (it decides locally what makes sense). **Cooperation**: merging the local correctness.
  -> design issue: how to combine inconsistencies (one agent says false, the other true)?
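A minimal sketch of one Contract Net Protocol round (the task, the bid function and the agent names are hypothetical): the manager announces a task, the capable contractors bid, and the manager awards the contract to the best bid.

```python
class Contractor:
    def __init__(self, name, workload):
        self.name, self.workload = name, workload

    def bid(self, task):
        """Return a cost bid for the announced task, or None if this contractor cannot do it."""
        if task["skill"] not in ("weld", "paint"):           # hypothetical capabilities
            return None
        return task["size"] + self.workload                  # busier contractors bid higher

class Manager:
    def announce(self, task, contractors):
        """One CNP round: announce -> collect bids -> award the task to the cheapest bidder."""
        bids = [(c.bid(task), c) for c in contractors]
        bids = [(cost, c) for cost, c in bids if cost is not None]
        if not bids:
            return None
        cost, winner = min(bids, key=lambda b: b[0])
        print(f"task '{task['skill']}' awarded to {winner.name} at cost {cost}")
        return winner

agents = [Contractor("r1", workload=3), Contractor("r2", workload=1), Contractor("r3", workload=7)]
Manager().announce({"skill": "weld", "size": 5}, agents)     # -> awarded to r2 at cost 6
```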
---

### Advanced models

* **Auctioning**
  Types:
  - **English auction**: first-price open-cry, the highest bidder wins
  - **first-price sealed-bid**: one round, hidden prices
  - **Vickrey (second-price sealed-bid)**: as above, but the winner pays the second-highest price
  - **Dutch** (descending): the price drops until someone says "I'll take it"
  Values:
  - **private-value** auction: the value is set by the bidder's own preferences (art)
  - **common-value** auction: the maximum price is common sense
  - **correlated-value** auction: a mix of the two above
  Bidder types:
  - **risk-averse bidders**: willing to pay more because they really want it -> best for the **Dutch** and **first-price sealed-bid** auctions (higher prices)
  - **risk-averse auctioneer**: willing to sell a bit lower than expected -> better off with Vickrey (higher sale price) or English auctions
  - **risk-neutral**: the auction type doesn't matter, the expected sale price is the same
* **Voting** -> used to solve goal coordination
  **Irrelevant alternative**: in a vote, A wins over B; add option C and now B wins over A, because some voters switch to C.
  **Binary protocol**: all decisions are made between 2 options; it could solve the irrelevant-alternative problem, but the **order** of the pairwise decisions biases the final result.
  **Borda protocol**: all voters **rank** the options -> still does not solve the irrelevant-alternative problem (no solution exists today).
* **(Negotiation)**: introduces more actions than the above: set a task, and the best-suited agent takes it.
* **Joint Planning**
  Move from several individual plans to a single coordinated one.
  **Design issues**: does nobody ever get stuck? is it optimal?
  Variants: a single planner with multiple executors (boss - employees) / multiple planners with a single executor (different requests, merge them) / multiple planners with multiple executors.
  Relationships among plans:
  (+) equality (plans are equivalent)
  (+) subsumption (A makes B easier)
  (-) resource conflict
  (-) incompatibility (executing A makes B infeasible)
* **Partial Global Planning**
  It is difficult to design the perfect joint plan taking everything into account -> the FA/C version of Joint Planning: locally decide what must be done in cooperation.
  Components:
  - objective: the goal
  - plan-activity map: what the parties are doing
  - solution-construction graph: about the interactions, the communication
  - status: project state info
  (-) some parts are decided locally -> not everything is visible (e.g. resource allocation, state incompatibility)
* **Commitments**
  All the coordination machinery above can be replaced by commitments and conventions.
  Commitment = an agreement to do something (e.g. New Year's resolutions, a promise).
  Precommitment = a commitment made now about the future.
  Leveled commitment = it can be cancelled, but under a penalty.
  Operation types:
  - create
  - discharge: the commitment is fulfilled
  - cancel
  - release: both sides agree to remove it
  - delegate: someone else will do it for you
  - assign: you do it for someone else
* **Conventions**
  = a description of how you may reconsider your commitments = rules.
  Challenge: they can decrease reliability (commitments might not be taken, or might be released) -> find a balance between too tough and too light conventions.
  Social conventions: do we inform the other agents about our commitments?
  -> centrality hypothesis of commitments & conventions: coordination = commitments + conventions (+ social conventions)

## 6. Multi Agent Learning (1/2)

How to generate an Intelligent System? -> use **learning**, because we cannot design every agent's behavior by hand (in multi-agent systems).

RL vocabulary:

* **Policy** $\pi$ = a mapping from states to actions. Can be non-deterministic (we get a probability for each action).
* **Exploration** = explore the world to generate experience = gather new info, vs **Exploitation** = exploit what has already been learned.
* **Value functions** $V$ (for a state) and $Q$ (for a state + action) = immediate rewards are translated into them (to get an estimate of long-term value).
* **Reward** $r$ = direct feedback telling the agent whether it is doing well. Can be non-deterministic.
* **$\pi^*$** = the optimal policy.
* **Discount factor** $\gamma$ = how much a reward loses its value over time.

### Single agent RL

There is:

* a set of **states**
* a set of **actions**
* a **transition** function (if you apply $a$ in $s$, which $s'$ do you end up in?). Can be non-deterministic.
* a **reward** function

-> find $\pi$ such that $V^{\pi}(s_t) = \sum_i \gamma^i r_{t+i}$ is maximal, where $\gamma$ is the discount factor (rewards count less the further they lie in the future).

-> assumes that the probability distributions are stationary: the consequences (probabilities) of the agents' actions do not change over time.

**Bellman equation** (important): $V_{i+1}(s) = r(s,\pi(s)) + \gamma V_i(\delta(s,\pi(s)))$, with $\delta$ the transition function.
Value of a state = the **reward** from the **current** state + the (discounted) value of the **future** states (given by $\pi$).

Policy iteration: iterate until the policy converges (then you have the optimal policy); each iteration takes the max of reward + future-state values.

-> the rewards and next-state values might be unknown -> so we define a **model-free** method using (state-action) **Q-values**.
(+) the transition and reward functions are not needed to find the optimal policy, just the Q-value table; wait long enough until the (max) Q-values converge.

Difference between V and Q: with **V**, the policy $\pi$ tells you which action(s) to take, and **then** you compute the discounted reward (= V). **Q** is per (s, a): compute all of them for $s$, and **then** update $\pi$ accordingly. $V(s) = \max_a Q(s,a)$.

### In non-deterministic environments

At each iteration we add to the current estimate (Q-value):

$Q(s,a) \leftarrow Q(s,a) + \alpha \, [\, r + \gamma \max_{a'} Q(s',a') - Q(s,a) \,]$

$\alpha$ = the learning-rate parameter in $(0,1]$: small = slow convergence. The first part ($r + \gamma \max_{a'} Q(s',a')$) is what the agent observes, the second ($Q(s,a)$) is what it believed it would observe; together they form the temporal-difference error. (A minimal sketch follows below.)

### Exploit vs Explore

- **Epsilon-greedy**: if $x < \epsilon$ -> explore, else exploit; $\epsilon$ decays over time (less exploration over time).
- **Boltzmann exploration**: a temperature plays the role of $\epsilon$; the probability of playing each action is given by the size of its Q-value.
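A minimal sketch of tabular Q-learning with the update rule and epsilon-greedy exploration described above (the 1-D grid world and the parameter values are assumptions for illustration).

```python
import random
from collections import defaultdict

# Hypothetical 1-D grid world: states 0..5, start at 0, reward 1.0 only when reaching state 5.
N_STATES, GOAL = 6, 5
ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.9, 0.2, 2000
ACTIONS = (-1, +1)                       # step left / step right

Q = defaultdict(float)                   # Q[(state, action)], defaults to 0

def step(state, action):
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, (1.0 if next_state == GOAL else 0.0)

def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise exploit (ties broken randomly)."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: (Q[(state, a)], random.random()))

for _ in range(EPISODES):
    state = 0
    while state != GOAL:
        action = choose_action(state)
        next_state, reward = step(state, action)
        # Temporal-difference error: observed (r + gamma * max_a' Q) minus believed Q(s, a).
        td_error = reward + GAMMA * max(Q[(next_state, a)] for a in ACTIONS) - Q[(state, action)]
        Q[(state, action)] += ALPHA * td_error
        state = next_state

print([round(max(Q[(s, a)] for a in ACTIONS), 2) for s in range(N_STATES)])
# V(s) = max_a Q(s, a) grows towards the goal: roughly [0.66, 0.73, 0.81, 0.9, 1.0, 0.0]
# (the terminal state itself is never updated).
```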
### Multi-agent RL

-> must take the other agents' actions into account in RL.

Difficulties:

- exponential growth of the state space (combinations of all agents' states)
- rewards are shared or depend on the other agents -> try shared rewards based on each agent's contribution to the win/loss
- in classical RL -> find which action led to a reward; in MAL -> additionally, which action of which agent led to the reward?

When agents compete, they must *NOT* have a deterministic policy (e.g. always playing the same moves in chess is easy to beat).

-> **Best-response policy**: compute the best (optimal) response to the other agents' strategies.
-> If all agents play a best response -> a **Nash equilibrium** (there can be multiple ones).

A **Nash equilibrium** might not be optimal, because each agent thinks locally. **Pareto optimality**: an outcome from which no agent can improve without another agent getting worse.

### Non-stationary environments

Action rewards are determined by the actions of all agents, and the agents learn simultaneously -> the best policy is a moving target, because the agents keep adapting to each other's strategies.

**Normal-form game** = e.g. the prisoner's dilemma. It can be:

- (a)symmetric (prisoner's dilemma / battle of the sexes): the rewards are the same relative to the chosen actions
- a zero-sum game (matching pennies / heads-tails)

### Learning Automata

Characteristics:

- policy iteration: it adapts its policy so that it converges to an optimum
- works in stateless environments; if the environment has states -> make one LA per state
- finite action set (originally)
- uses the immediate reward to update the action probabilities

Technically: a vector containing the probability of each action; at each iteration, increase the probability of the action that got a good reward and decrease the probabilities of the other actions (see the sketch below).

$\alpha$, $\beta$ are the learning parameters: the reward and penalty parameters.

- If they are equal: linear reward-penalty.
- If $\beta = 0$: linear reward-inaction.
- If $\alpha \gg \beta$: linear reward-$\epsilon$-penalty.
- If $\alpha = 1$ and $\beta = 0$: cross learning (only rewards are learned; actions that are taken become even more likely to be retaken, and all actions that were not taken get penalized).

Rewards: $r \in [0,1]$; in a state-based environment, $r$ is the average of the rewards received since the last time this state was visited.
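A minimal sketch of one common formulation of the learning-automata update, applied to a hypothetical two-armed bandit (the payoff probabilities and parameter values are assumptions): linear reward-penalty, which reduces to linear reward-inaction for $\beta = 0$.

```python
import random

def la_update(probs, action, reward, alpha=0.1, beta=0.0):
    """One linear reward-penalty update (beta = 0 gives linear reward-inaction).
    reward is assumed to lie in [0, 1]; probs is the action-probability vector."""
    k = len(probs)
    new = []
    for j, p in enumerate(probs):
        if j == action:   # played action: pushed up proportionally to the reward
            new.append(p + alpha * reward * (1 - p) - beta * (1 - reward) * p)
        else:             # other actions: pushed down on rewards, redistributed on penalties
            new.append(p - alpha * reward * p + beta * (1 - reward) * (1.0 / (k - 1) - p))
    return new

# Hypothetical 2-armed bandit: arm 0 pays off with probability 0.8, arm 1 with probability 0.3.
payoff = [0.8, 0.3]
probs = [0.5, 0.5]
for _ in range(2000):
    action = random.choices(range(2), weights=probs)[0]
    reward = 1.0 if random.random() < payoff[action] else 0.0
    probs = la_update(probs, action, reward, alpha=0.05, beta=0.0)   # reward-inaction
print([round(p, 2) for p in probs])   # the probability of the better arm (arm 0) should approach 1
```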
### Regret Minimization

Characteristics:

- stateless
- after having taken the action, we learn what the best possible reward would have been (e.g. after taking an exam). Instead of $r$, use (best estimated $r$ - obtained $r$) -> try to minimize these losses (the regret).

### Joint Learning

Naive approach: do not take the other agents into account at all -> might work, but usually not.

### Joint Action Learning

1) model the opponents'/teammates' action probabilities by keeping frequency counts
2) learn Q-values over joint actions -> expected value of own action $a$: $EV(a) = \sum_{a_{-i}} Q(a, a_{-i}) \cdot \hat{P}(a_{-i})$, summing over the other agents' expected actions weighted by the probability that they play them

JAL:
(+) can create cooperation
(-) has to store many probabilities
(-/+) assumes that observing all agents' actions is possible -> if not, you have to guess the other agents' policies

## 6. Multi Agent Learning (2/2)

There is a need to compute good Q-values which mimic immediate rewards.

### Value iteration

$Q(s,a,o)$, where $o$ is the other agent's action.
$V(s) = \max_{\pi} \min_{o} \sum_a \pi(a) \, Q(s,a,o)$: min, because the opponent will choose to minimize our reward.

Expected payoff of a game = $\sum$ (probability of agent 1's action) $\times$ (immediate reward) $\times$ (probability of agent 2's action) -> use this as a stand-in for the immediate reward -> used in MiniMax Q-learning.

### MiniMax Q-learning

Initialisation: initialise all Q-values to 1; initialise all state-action probabilities uniformly.
Choosing an action: with epsilon-greedy, choose action $a$ according to the current policy.
Learning:

- get the immediate reward from $a$ + $o$ (the opponent's action)
- update the Q-value based on the immediate reward received (a weighted average of the new and old Q)
- update the policy (the state values) based on the maxmin action

Properties:

- designed for **zero-sum** games
- designed to take the **safest** strategy (against the worst-case opponent)
- the **linear program** to be solved at each step can make the algorithm slow

### Nash Q-learning

Properties:

- handles **general-sum games** (zero-sum and more -> an improvement on Minimax)
- Nash (optimal) Q-values = the Q-values of a Nash equilibrium; they are computed using every agent's Q-values (-> assuming all agents also pick NE actions)
- agents have to be able to observe the other agents and their respective rewards (or estimate them)
- computing a NE per agent at each turn is computationally heavy (more than the linear program of minimax)
- weak convergence guarantees (strong restrictions)

#### Nash Q-values

Defined as the expected sum of rewards if all agents use the NE strategy. **Multiple** Nash equilibria? -> pick one of them (e.g. at random).

#### Implementation

Set all Q-values to 0. Each agent learns the Q-tables of every agent. For all agents, update the current estimate with the new estimate:

new estimate $= r_j^t + \gamma \, NashQ_j^t(s')$

$NashQ_j^t(s')$ = the value of state $s'$ for agent $j$ if all agents play the NE strategy $= \pi_1(s') \cdots \pi_m(s') \cdot Q_j^t(s')$, where the $\pi_i$ are the action probabilities (policies) of each agent under their respective NE policy, multiplied into the (huge) Q-table.

### Friend-or-Foe Q-learning

Nash Q-learning's convergence guarantees hold only in specific cases -> FoF handles that, but it works in cooperative and zero-sum games only.

Idea: in a competitive game -> use minimax; in a cooperative game -> use the max over the joint actions -> get rid of calculating the NE.

### Dynamics of Multi-Agent Learning

Studied with Evolutionary Game Theory = how the agents are going to evolve over **time**: $\Delta x(t) = x(t+1) - x(t)$, taking the limit of infinitesimally small steps, where $x$ is the vector of action probabilities -> what is the expected change over time?

#### Cross learning

$E[\Delta x_i] = x_i [f_i - f_i x_i] + \sum_{j \neq i} x_j [-f_j x_i]$

- $x_i$ = the probability of playing action $i$
- $f_i$ = the expected reward $= e_i R y^T$ = the reward for the row player playing action $i$ against the column player's mix; $y$ is the probability distribution over the opponent's actions

$= x_i \, [\, e_i R y^T - \sum_j x_j e_j R y^T \,] = x_i \, [\, e_i R y^T - x R y^T \,]$

= the expected change of the policy for a single (small!) step = the derivative $\frac{\partial x_i}{\partial t}$, i.e. the replicator dynamics (see the simulation sketch further below).

#### Frequency Adjusted Q-learning

In Q-learning we mix the old estimate with the observed estimate, weighted by $\alpha$. -> Problem: actions that we rarely pick are rarely updated, so the change of a Q-value depends not only on the obtained reward but also on the **probability distribution** over actions. -> Decrease the influence of this probability with **Frequency Adjusted Q-learning**: scale the update of action $i$ by $\frac{1}{x_i(t)}$. -> Problem: for really small probabilities this gives overly large updates -> solve it by introducing a second parameter $\beta$: if $x_i < \beta$, use a factor of 1 (no normalising); otherwise use $\frac{\beta}{x_i(t)}$ (i.e. scale by $\min(1, \beta / x_i(t))$).

### Gradient Ascent Learning

$V(x)$ is the expected value of policy $x$ -> $x_i(t+1) = x_i(t) + \Delta x_i(t)$, where the step $\Delta x_i$ follows the gradient of $V$ with a small step size.
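A small simulation sketch of the cross-learning / replicator dynamics above (the prisoner's-dilemma payoffs scaled to [0, 1], the starting policies and the step size are assumptions): both players' policies follow $\dot{x}_i = x_i[(Ay)_i - xAy]$ via simple Euler steps.

```python
import numpy as np

# Hypothetical prisoner's dilemma payoffs scaled to [0, 1]; rows = own action, columns = opponent's.
A = np.array([[0.6, 0.0],     # cooperate
              [1.0, 0.2]])    # defect

def replicator_step(x, y, dt=0.01):
    """One Euler step of the cross-learning / replicator dynamics dx_i/dt = x_i[(Ay)_i - xAy]."""
    fx = A @ y                       # expected reward of each own action against the opponent's mix
    fy = A @ x
    x = x + dt * x * (fx - x @ fx)
    y = y + dt * y * (fy - y @ fy)
    return x / x.sum(), y / y.sum()  # renormalise against numerical drift

x = np.array([0.9, 0.1])             # both players start out mostly cooperating
y = np.array([0.8, 0.2])
for _ in range(5000):
    x, y = replicator_step(x, y)
print(np.round(x, 3), np.round(y, 3))
# Both policies drift to 'defect' (~[0. 1.]): the Nash equilibrium, even though it is not Pareto optimal.
```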
## 7. Deep Learning for Agents

Deep network: each layer transforms the input into a representation that is more suitable for the machine to accomplish its task (e.g. classification).

**Sparse Coding**: used so that computers come up with their own representation of the data.

- $\phi$ is the computer's representation basis, $\alpha$ are the weights, $x$ is the data
- $x - \alpha \cdot \phi$ = the reconstruction error of the data encoded in the computer's representation basis
- add a regularisation term -> pushes as many weights as possible to 0

**Autoencoders**

The middle layer is a bottleneck: "zip" the input in order to "unzip" it at the output. Applications: imputation, inpainting, denoising. (A minimal sketch follows below.)

**Stacked autoencoders**: a concatenation of pretrained autoencoders.

**Feature discovery**: using an autoencoder => a better representation for a classifier.
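A minimal sketch of a bottleneck autoencoder (the toy data, the purely linear layers and the plain gradient-descent training are assumptions made to keep it short): the 2-unit middle layer is the learned compressed representation, and training minimises the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed): 5-D points that actually live on a 2-D subspace,
# so a 2-unit bottleneck is enough to reconstruct them.
N, d, k = 200, 5, 2
X = rng.standard_normal((N, k)) @ rng.standard_normal((k, d))

# Linear autoencoder: encoder W1 (d -> k bottleneck), decoder W2 (k -> d).
W1 = 0.1 * rng.standard_normal((d, k))
W2 = 0.1 * rng.standard_normal((k, d))
lr = 0.05

for step in range(5000):
    H = X @ W1                        # code / compressed representation ("zip")
    X_hat = H @ W2                    # reconstruction ("unzip")
    G = 2 * (X_hat - X) / X.size      # gradient of the mean squared error w.r.t. X_hat
    grad_W2 = H.T @ G                 # backprop through the decoder
    grad_W1 = X.T @ (G @ W2.T)        # backprop through the encoder
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
    if step == 0:
        print("initial loss:", round(np.mean((X_hat - X) ** 2), 3))
print("final loss:  ", round(np.mean((X @ W1 @ W2 - X) ** 2), 4))   # should be far smaller
```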
**Convolutional filter** // IVP

**CNN**: apply a node that is connected to a sub-image -> a **local filter**.
**Max pooling** = pass the max of several nodes on to the next-layer node -> a more **global filter**.

-> the biggest networks reach 175 billion parameters, which is not really efficient to train.

### AlphaGo

The state is fully observable, deterministic and discrete.

**Planning and search** alone -> impossible, because there are **too many states** -> we need to limit the width and **depth** of the search tree, using MCTS:

First step)

- width: do not explore every move -> an ML classifier (CNN) labels moves as good/bad. Training data = moves that experienced players make / do not make.
- depth: we need an evaluation function -> train a NN.

Training the NN // gradient descent to find the optimal parameters; gradient descent needs the derivative of the function, and the values must stay in [0, 1] (logistic function), otherwise the multiplications explode and backpropagating the error becomes impossible.

Second step) reinforcement learning, AI 1 vs AI 2 -> increase the likelihood of the actions that led to a win by multiplying with the ML classifier's output.

### AlphaZero

- **merge** the classifier that predicts a good move and the classifier that evaluates the state
- drop the hand-tuned parts of **MCTS** (which required parameter tuning) to make it simpler

Then let reinforcement learning (self-play) do the rest.

=> less accuracy in predicting human moves -> the AI found its own way to play, so predicting humans became less relevant. Perhaps humans try to win by the largest margin, while the AI is fine with **winning barely**?
=> but more accuracy on the state of the game (who is going to win).

### Comparison with robotics

AlphaZero is extremely performant for a very specific task: a game of ~200 moves, only a 19x19 "image", discrete, deterministic and fully observable -> such environments are very rare for robots -> the outside world is even more difficult to handle.

#### StarCraft: a game with a complex environment

- use of **belief states**: past states // now // guessed future states
- use of **recurrent neural networks**: to have a memory
- use of **attention models**: to track which part of the history should be paid attention to; then use relational RL

-> it works, but it is not that impressive, since the agents have played the equivalent of ~200 years of gaming // a human needs nothing close to that.

## 8. Reaching Human Level Intelligence

Examples of Intelligent Systems

---

Artificial General Intelligence? -> using evolution?

**Simulate evolution** (because blind evolution gave rise to intelligent systems: us).

(-) maybe evolution was **lucky** (a narrow path to intelligence) and we will not be able to reproduce what it has done
(-) **natural** intelligence is different from artificial intelligence (do submarines swim?)
(-) maybe not enough **computation power** // the number of neurons in the human brain

### Brain emulation (reproduction)

Levels: the **brain** (// functionality) - the **mind** (consciousness?) - the **person** (personality).

Emulation levels: the degree of complexity and detail to which we have to go to model the human brain -> which technologies do we have to develop? which ones are specific to brain emulation?

Types of technologies required:

* Scanning (imaging): scan the brain to understand how it works
* Translation (scan interpretation): from brain activity to computational activity
* Neural simulation
* Environment simulation? in which environment should the brain live? a virtual one? does the brain need a body?

Seed AI -> an AI that bootstraps itself through learning. Why not create a **baby** mind that will learn?

Current research isn't working on making AGI.